CHAPTER 1

INTRODUCTION 


Turbo codes, representing one of the most important breakthroughs in coding, are able to
operate near the Shannon limit, and extensive research results on this technique continue
to be reported. The commonly accepted turbo encoder consists of two parallel concatenated
recursive systematic convolutional encoders separated by an interleaver [1]. The maximum
a posteriori probability (MAP) algorithm is applied for decoding because of its improved
performance [2]. Since low-rate codes are not appropriate for commonly used applications,
there is a need to develop high rate turbo codes [3]. It has been shown that some high rate
codes perform very well while others perform poorly, and it is claimed that the selection
of puncturing patterns has considerable influence on the performance [3]. In this report,
the performance of high rate turbo codes is analyzed on the basis of simulation results.
For high rates with normal performance, different puncturing patterns are selected in the
simulations and their performance is compared. For special high rate codes with poor
performance, an alternative puncturing algorithm is developed that shows significant
improvement in performance.

Iterative decoding of block codes has recently attracted growing interest. A
log-likelihood algorithm is used in the decoding, and "symbol by symbol" MAP
decoding is the optimal method [4]. The construction of a trellis for the block code is
the first and a key step in the decoding [5]. Using the constructed trellis, an extrinsic
value can be calculated for each information bit by the MAP algorithm; this value is then
used as the a priori value in the next iteration. The procedures for trellis construction,
extrinsic value calculation, and the iterative algorithm are discussed in detail in this report.

Before the discussion of turbo codes and iterative block decoding, a review of 
coding, block codes and convolutional codes is given in this chapter. Turbo convolutional 
codes are discussed in Chapter 2 and iterative block decoding is introduced in Chapter 3. 
1.1 Coding

A cost-effective system transmits information at a rate and a level of reliability
that are acceptable. Two parameters are important in the design of a digital
communication system. The first is the ratio of signal energy per bit to noise power
spectral density, E_b/N_0. The second is the bandwidth. Practical considerations
place a limit on the available E_b/N_0; it follows that under some conditions it is
impossible to provide acceptable quality because of inadequate E_b/N_0.

Channel coding is used to provide reliable transmission of digital
information over the channel. For a fixed value of E_b/N_0, coding is a good and practical
way to improve the data quality. For a fixed error rate, coding lets us
decrease the required E_b/N_0, which in turn decreases the required
transmitted power.

Coding introduces redundancy into the message according to a prescribed rule so that
errors can be detected and corrected. The transmission process with coding is shown in
Fig 1.1. The channel encoder accepts message bits and adds redundancy according to the
coding rule. The channel decoder exploits the redundancy to decide which message bits were
transmitted.

Errors, which are the effect of channel impairments, are minimized by coding. However,
not all errors can be detected and corrected by coding. The correction capacity
depends on the similarity between the acceptable and the unacceptable code words. Block
coding and convolutional coding are the two most important and widely used coding
methods.



Fig 1.1 Transmission with coding
1.2 Block coding and decoding

1.2.1 Definition of block codes 

Codes formed by taking a block of k information bits and adding m redundant
bits to form a code word of n = k + m bits are called block codes. They are
denoted (n, k) codes. A code whose n-bit codeword contains the k information bits
explicitly, followed by the m redundant bits, is called a systematic code. A code in which
the k information bits are not explicitly present in the codeword is called a
nonsystematic code.

The k information bits represent 2^k equally likely messages. The total
number of possible n-bit words is 2^n. There are 2^n - 2^k n-bit words that do not
represent possible messages.

If we want to maintain the rate of information transmission, the transmitted bit rate
must increase after coding by the factor R_c / R_b = n / k, where R_c and R_b are the
coded and uncoded bit rates respectively.

1.2.2 Block coding 

Assume that the uncoded word is u = [u_1 u_2 u_3 ... u_k]. The generation of a block
code starts with the selection of the number m of parity bits to be added. Specify an H
matrix,

H = [h | I_m] (1.1)

which is made up of an m x k submatrix h and an m x m identity submatrix I_m. Each
element h_ij of the matrix is either 1 or 0. Write the coded word as
v = [u_1 u_2 ... u_k p_1 p_2 ... p_m], where v and H should satisfy the equation

Hv^T = 0 (1.2)

To generate a code word v from u, we form a generator matrix G,

G = [I_k | h^T] (1.3)

G consists of an identity submatrix of dimension k x k and a second submatrix, which is
the transpose of h (one of the submatrices of H). The codeword v corresponding to each
uncoded word u is

v = uG (1.4) 

For each u generated, equation (1.2) should be satisfied. 
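The construction above can be sketched in a few lines of Python. This is only an illustration: the particular 3 x 4 submatrix h below is an assumed example (it yields a (7,4) code), and the helper names are hypothetical rather than taken from this report.

```python
from itertools import product

# Assumed example submatrix h (m = 3 rows, k = 4 columns); any binary h works.
h = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]
m, k = len(h), len(h[0])


def identity(size):
    """Return the size x size identity matrix as a list of rows."""
    return [[int(i == j) for j in range(size)] for i in range(size)]


# H = [h | I_m] and G = [I_k | h^T], as in the text.
H = [h[i] + identity(m)[i] for i in range(m)]
G = [identity(k)[i] + [h[j][i] for j in range(m)] for i in range(k)]


def encode(u):
    """Compute v = uG over GF(2)."""
    return [sum(u[i] * G[i][j] for i in range(k)) % 2 for j in range(k + m)]


def syndrome(word):
    """Compute H word^T over GF(2)."""
    return [sum(H[i][j] * word[j] for j in range(k + m)) % 2 for i in range(m)]


# All 2^k codewords; each one should satisfy Hv^T = 0.
codewords = [encode(list(u)) for u in product([0, 1], repeat=k)]
```

For example, `encode([1, 0, 1, 1])` appends the three parity bits 0, 1, 0 to the four information bits, and `syndrome` returns the all-zero vector for every entry of `codewords`.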

1.2.3 Block decoding 

Decoding the received codewords can be done by evaluating the correlation of the 
received word with all possible words, and the one that exhibits the closest correlation is 
determined as the transmitted codeword. This method is not efficient for codewords of 
large length. Block coding provides an alternate way to reduce the complexity of 
decoding. 

Denote the received word by r. It may or may not be the same as the transmitted
codeword v. We can test whether r is a valid codeword by using the equation

Hr^T = 0 (1.5)

If equation (1.5) is not satisfied, at least one bit is in error. If the equation is
satisfied, we still cannot be absolutely certain that r is correct and equal to v,
because several errors may have occurred in the transmission and happened to change the
transmitted codeword into another valid codeword. When r ≠ v, we write

r = v + e (1.6)

where e is the error pattern. Thus we also have

r^T = v^T + e^T (1.7)

A 1 in the error pattern e indicates an error in the corresponding bit
position, and a 0 indicates that no error has been made.
We can begin the decoding by evaluating the syndrome s of the received
word r. Since Hv^T equals 0,

s = Hr^T = H(v^T + e^T) = Hv^T + He^T = He^T (1.8)

If s is not equal to zero, there are one or more errors. If s equals zero, either
there is no error or the error pattern is itself a valid codeword. The syndrome is
calculated from the received word by the equation

s = Hr^T (1.9)

For the single error case, we can compare s with each column of H. If the i-th column of
H is identical to s, then the i-th bit of the codeword is in error. For more than one
error, we must solve (1.8) and identify the error patterns. The error pattern with the
fewest errors should be selected.

The number of possible error patterns that satisfy (1.8) for a given syndrome is 2^k.
Thus the number of error patterns becomes very large as k increases. However, there is a
maximum number of errors that a code can correct, so we can ignore error patterns with
more errors than that number since we cannot correct them.
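As an illustration of the single-error case, the sketch below computes the syndrome and compares it with the columns of H. The matrix H is an assumed (7,4) example of the form [h | I], and the function names are hypothetical; bit positions are counted from 0 in the code.

```python
# Assumed example parity-check matrix H = [h | I_3] for a (7,4) code.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]


def syndrome(r):
    """Compute s = H r^T over GF(2)."""
    return [sum(row[j] * r[j] for j in range(len(r))) % 2 for row in H]


def correct_single_error(r):
    """Return (corrected word, error position or None).

    Assumes at most one bit of r is in error. The position is 0-based.
    """
    s = syndrome(r)
    if not any(s):
        return list(r), None  # zero syndrome: r is a valid codeword
    columns = [[row[j] for row in H] for j in range(len(r))]
    if s in columns:
        i = columns.index(s)  # the column equal to s marks the bit in error
        fixed = list(r)
        fixed[i] ^= 1
        return fixed, i
    return list(r), None  # syndrome matches no column: more than one error


v = [1, 0, 1, 1, 0, 1, 0]   # a valid codeword of this example code
r = list(v)
r[2] ^= 1                   # introduce a single error in position 2
decoded, position = correct_single_error(r)
```

Here the syndrome of `r` equals the third column of H, so the third bit is flipped back and `decoded` equals `v`.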

1.2.4 Common block codes 
1.2.4.1 Single parity-check bit codes 

Single parity-check bit coding is the simplest block coding method. It works
as follows:

1) A redundant bit p_1 is added at the end of the information bits, so

n = k + m = k + 1 (1.10)

2) If the information bits contain an odd number of 1's, or equivalently, the modulo-2
sum of the information bits equals 1, p_1 is set to 1.

3) Otherwise, p_1 is set to 0.

This method keeps an even number of 1's in the transmitted message. If the
received message contains an odd number of 1's, then an error must have occurred in the
transmission. This method works well only when the probability of more than one error
occurring in a codeword is quite low. Another shortcoming of this
method is that it can only detect an error but cannot determine which bit is in error. It
follows that this method cannot be used to correct the error.
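A minimal sketch of the even-parity rule follows; the function names are illustrative only. It also shows the limitation noted above: two errors in the same codeword leave the parity even and go undetected.

```python
def add_parity(bits):
    """Append p_1 so that the codeword contains an even number of 1's."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """Return True when the received word contains an even number of 1's."""
    return sum(word) % 2 == 0


codeword = add_parity([1, 0, 1, 1])   # three 1's, so p_1 = 1

one_error = list(codeword)
one_error[0] ^= 1                     # a single error is detected

two_errors = list(one_error)
two_errors[1] ^= 1                    # a second error hides the first
```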
1.2.4.2 Simple repetition codes

This method repeats a binary bit 2t+1 times. Since k = 1 and m = 2t,

n = k + m = 1 + 2t (1.11)

A repetition code of length 2t + 1 can correct as many as t errors. However, it needs
significant bandwidth because the rate is reduced to 1/(2t+1). Such codes are therefore
inefficient.
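The repetition code can be sketched with majority-vote decoding, which is one natural decoder for it (assumed here; the text does not name a decoder). With t = 2 the code has length 5 and corrects up to two errors, while three errors cause a wrong decision.

```python
def rep_encode(bit, t):
    """Repeat one information bit 2t + 1 times."""
    return [bit] * (2 * t + 1)


def rep_decode(word):
    """Majority vote over the received bits."""
    return int(sum(word) > len(word) // 2)


sent = rep_encode(1, 2)               # [1, 1, 1, 1, 1], rate 1/5
two_errors = [0, 0, 1, 1, 1]          # t = 2 errors: still decoded as 1
three_errors = [0, 0, 0, 1, 1]        # t + 1 errors: decoded as 0
```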

1.2.4.3 Hamming codes

Let d be the Hamming distance between a pair of codewords. The minimum distance
(d_min) is defined as the minimum value of d over all pairs of codewords. The greatest
likelihood of confusion between words is encountered for a codeword pair where d is the
minimum. So the minimum distance establishes the upper limit of the effectiveness of a
code.

In a Hamming code, we have

Block length n = 2^m - 1 (1.12)

Number of message bits k = 2^m - 1 - m (1.13)

Number of parity bits n - k = m (1.14)

where m ≥ 3. If d_min = 2t + 1, then up to t errors can be corrected. In a
(7,4) Hamming code (m = 3), the smallest Hamming weight of the nonzero codewords is 3,
so d_min = 3 and t = 1; it follows that a single error can be corrected.

The parity check matrix H has m rows and n columns. Each column is unique and
no column consists of all zeros. To form a systematic code, the columns are arranged to
separate the submatrix h and the identity submatrix I.

1.2.4.4 Cyclic codes 

Cyclic codes form a subclass of linear block codes and have the advantage
that they are easily encoded and decoded. Indeed, many of the important linear block
codes are either cyclic codes or closely related to cyclic codes. Cyclic codes have two
fundamental properties:

1) Linear property: The sum of two codewords is also a codeword.

2) Cyclic property: A cyclic shift of a codeword forms another valid codeword. Codewords
can be written in a cycle. There are 2^k - 1 starting points to read the code, each
related to the other by a shift.

The Hamming code is an example of a cyclic code. Consider a (7,4) Hamming code.
The number of information bits is k = 4. It follows that there are a total of 16 codewords
for the Hamming code. Two groups of seven of them are precisely the cyclic-shift related
words. The last two codewords, other than these fourteen, are 0000000 and 1111111. For
these two codewords, any cyclic shift forms the same codeword.
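The grouping of the 16 codewords can be checked numerically. The sketch below builds a cyclic (7,4) Hamming code from the generator polynomial g(x) = 1 + x + x^3; this polynomial is an assumption made for the illustration and is not given in the text above.

```python
from itertools import product

n, k = 7, 4
g = [1, 1, 0, 1]   # assumed generator polynomial 1 + x + x^3, lowest degree first


def poly_mul_mod2(a, b):
    """Multiply two binary polynomials (coefficient lists) over GF(2)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out


# Each codeword is message(x) * g(x); the product has exactly n = 7 coefficients.
code = {tuple(poly_mul_mod2(list(msg), g)) for msg in product([0, 1], repeat=k)}


def shift(word):
    """Cyclic shift by one position."""
    return word[-1:] + word[:-1]


def orbit(word):
    """Collect all cyclic shifts of a word."""
    seen = set()
    while word not in seen:
        seen.add(word)
        word = shift(word)
    return frozenset(seen)


orbit_sizes = sorted(len(o) for o in {orbit(w) for w in code})
```

Running it gives two groups of seven shift-related codewords plus the two codewords 0000000 and 1111111, each of which is its own cyclic shift.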

1.2.4.5 Other block codes

Some other types of block codes include Hadamard code, extended code, Golay 
code, and BCH code. 

1.3 Convolutional coding and decoding
1.3.1 Definition of convolutional codes

In block coding, the encoder generates an n-bit codeword from a k-bit message. The
code words are produced on a block-by-block basis, so there must be a buffer to store the
message before the encoding is done. In convolutional coding, no such buffer is
needed. A convolutional encoder operates on the incoming message sequence
continuously in a serial manner. An example of a typical convolutional encoder is shown
in Fig 1.2, in which we see that a convolutional code is generated by combining the
outputs of an M-stage shift register by means of N_A modulo-2 adders.



Fig 1.2 An example of a typical convolutional encoder 
As shown in Fig 1.2, if the length of the message is L and the
system consists of an M-stage shift register and N_A modulo-2 adders, the code rate is

R = L / (N_A (L + M)) (1.15)

Normally, L is much larger than M, so the code rate can be simplified to 1/N_A.

The constraint length is defined as the number of shifts over which a single message bit
can influence the encoder output. The constraint length K equals M + 1 in convolutional
coding. The structural properties of a convolutional encoder are portrayed in graphical
form by three equivalent diagrams: the code tree, the code trellis, and the state
transition diagram.


1.3.1.1 Code tree

Fig 1.3 shows the first several stages of a code tree. Each branch of the tree
represents an input symbol. Normally, input 0 specifies the upper branch and input 1
specifies the lower branch. A specific path in the tree is traced from left to right in
accordance with the input sequence. The corresponding coded symbols on the branches
of that path constitute the sequence supplied by the encoder to the discrete channel input.
The tree becomes repetitive after K branches. The nodes become identical because the
first bit has been shifted out of the register.

Fig 1.3 The structure of code tree
1.3.1.2 Code Trellis

The code tree can be transformed into a new form, called a trellis. A trellis is a
tree-like structure with remerged branches.

As in a tree, each input sequence corresponds to a specific path through the trellis.
However, a trellis is more instructive than a tree in that it brings out explicitly the fact
that the associated convolutional encoder is a finite-state machine.

The state is defined as the most recent M message bits shifted into the encoder
register. The state of this encoder can assume any one of 2^{K-1} possible values. The
trellis contains L + K levels; the level is called the depth of the trellis. A trellis is
preferred to a tree because the number of nodes at any level of the trellis does not
continue to grow as the number of incoming message bits increases. Fig 1.4 shows part of a
trellis between depth i and depth i+1. The solid lines represent inputs of 0, and the
dashed lines represent inputs of 1.

Fig 1.4 A part of trellis between depth i and depth i+1
1.3.1.3 State transition diagram

Though it looks very simple, the state diagram completely describes the input-output
relation of a convolutional encoder.

Fig 1.5 The structure of state diagram

The nodes of the state diagram represent the possible states of the encoder. Each
node has 2^{M-1} incoming branches and 2^{M-1} outgoing branches. The label on each of
the branches represents the encoder's output as it moves from one state to another.

1.3.2 Convolutional encoding

The operation of the encoder (Fig 1.2) proceeds as follows:

1) Assume the shift register is initialized to the clear (all-zero) condition.

2) The first bit of input data enters the first register stage M_1.

3) During the message bit interval, the adders calculate the N_A outputs.

4) The next message bit moves into M_1, the first bit transfers from M_1 to M_2, and
again all N_A adder outputs are calculated.

5) This process continues until the last bit of the message enters M_1.

6) Enough 0's are appended to the end of the message sequence to allow the whole
encoding process to be completed as the last bit leaves the last register stage.

7) The shift register is then in its original clear condition again.
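The steps above can be sketched as follows. The sketch models the encoder as the current input bit plus M stored bits, so each adder is described by K = M + 1 tap coefficients; the tap sets (1,1,1) and (1,0,1) are an assumed textbook example and are not taken from Fig 1.2.

```python
def conv_encode(message, generators):
    """Encode a bit list with a rate 1/N_A feed-forward convolutional encoder.

    generators: one tuple of K = M + 1 tap coefficients per modulo-2 adder,
    applied to [current bit, previous bit, ..., bit from M shifts ago].
    """
    M = len(generators[0]) - 1
    register = [0] * M                      # step 1: register starts cleared
    out = []
    for bit in list(message) + [0] * M:     # step 6: M tail zeros flush the register
        window = [bit] + register
        for g in generators:                # steps 3-4: all N_A adder outputs
            out.append(sum(gi & wi for gi, wi in zip(g, window)) % 2)
        register = window[:-1]              # shift by one stage
    return out


generators = [(1, 1, 1), (1, 0, 1)]         # assumed example, M = 2, N_A = 2
coded = conv_encode([1, 0, 1, 1], generators)
```

For this message of length L = 4 the output has N_A (L + M) = 12 bits, which matches the code rate L / (N_A (L + M)) given in (1.15).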
1.3.3 Convolutional decoding

1.3.3.1 Maximum likelihood decoding of convolutional codes

The Viterbi algorithm for the decoding of convolutional codes is developed in two steps.
First, for the binary symmetric channel (BSC), the maximum-likelihood decoder reduces
to a minimum (Hamming) distance decoder. Second, the
trellis representation is used to establish the basic concepts of the Viterbi algorithm.

Let v denote the code vector at the channel input and r the corresponding
received vector. Vector r may differ from vector v if errors occur due to channel
noise. However, from the received r, we can estimate v. The decoding rule for choosing
the estimate of v, given the received vector r, is said to be optimum when the probability
of decoding error is minimized. The maximum-likelihood decoding rule for the binary
symmetric channel is as follows: choose the estimate of v that minimizes the Hamming
distance between r and the estimate. Thus for the binary symmetric channel, the
maximum-likelihood decoder reduces to a minimum distance decoder.

Thus we may decode a convolutional code by choosing a path in the code tree 
whose coded sequence differs from the received sequence in the fewest number of places. 
We may equally limit our choice to the possible paths in the trellis representation of the 
codes. 

The Viterbi algorithm makes a sequence of decisions as it works through the trellis.
The algorithm operates by computing a "metric" for every possible path in the trellis. The
metrics of the 2^{K-1} possible paths entering the node are compared and the one with the
lowest metric is retained. The paths that are retained are called survivors. No more than
2^{K-1} survivor paths and their metrics will ever be stored. This relatively small list of
paths is always guaranteed to contain the maximum-likelihood choice.

The steps can be described as:

1) Starting at level i = M, compute the metric for the single path entering each state of
the encoder. Store the survivor and its metric for each state.

2) Increment the level i by 1. Compute the metric for all the paths entering each state by
adding the metric of the incoming branch to the metric of the connecting survivor
from the previous time unit. For each state, identify the path with the lowest metric
as the survivor. Store the survivor and its metric.

3) If level i < L + M, repeat step 2; otherwise, stop.

The Viterbi algorithm is a maximum-likelihood decoder, which is optimal for an additive
white Gaussian noise channel.
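A hard-decision version of these steps can be sketched as below, using the Hamming distance as the metric (the BSC case). Several details are assumptions of the sketch rather than of the text: the encoder takes one input bit per shift, the example tap sets are (1,1,1) and (1,0,1), and the recursion starts from the all-zero state at level 0 instead of level M, which gives the same survivors.

```python
def branch(state, bit, generators):
    """Return (next state, output bits) for one trellis branch."""
    window = (bit,) + state
    output = tuple(sum(g & w for g, w in zip(gen, window)) % 2
                   for gen in generators)
    return window[:-1], output


def viterbi_decode(received, generators):
    """Minimum Hamming distance decoding of a zero-terminated code sequence."""
    M = len(generators[0]) - 1
    n_out = len(generators)
    start = (0,) * M
    survivors = {start: (0, [])}            # state -> (metric, decoded bits)
    for i in range(0, len(received), n_out):
        r = received[i:i + n_out]
        candidates = {}
        for state, (metric, bits) in survivors.items():
            for bit in (0, 1):
                nxt, out = branch(state, bit, generators)
                d = metric + sum(a != b for a, b in zip(out, r))
                # keep only the lowest-metric path entering each state
                if nxt not in candidates or d < candidates[nxt][0]:
                    candidates[nxt] = (d, bits + [bit])
        survivors = candidates
    metric, bits = survivors[start]         # tail zeros return the encoder to state 0
    return bits[:len(bits) - M], metric


generators = [(1, 1, 1), (1, 0, 1)]
sent = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1]   # encoding of 1011 plus two tail zeros
noisy = list(sent)
noisy[3] ^= 1                                  # one channel error
decoded, distance = viterbi_decode(noisy, generators)
```

With a single channel error the decoder still returns the message 1011, and the final metric equals the number of corrected bit errors.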

1.3.3.2 Sequential decoding of convolutional codes

This method is sub-optimal but can avoid the computation of the likelihood, or 
metric, of every path in the trellis, thereby reducing computational complexity and 
allowing the constraint length K to take on very large values. Although sequential 
decoding algorithms are not as good as maximum likelihood decoding algorithms, they 
are computationally efficient for large K. 

Sequential decoding is an intuitive trial-and-error technique for searching out the 
correct path in a code tree. During the course of this search, the decoder moves forward 
and backward in the code tree, one node at a time. The decision to move forward or 
backward is determined by the manner in which the metric of the algorithm varies along 
the path followed by the decoder. 

Several algorithms have been devised for the sequential decoding of
convolutional codes. The Fano algorithm is probably the most important because it has the
useful feature that it uses very little storage. In the Fano algorithm, the decoder moves
forward and backward; the decision is made by comparing the path's Fano metric at the
node with a running threshold maintained by the decoder.

In addition to the computer requirements for executing the Fano algorithm, the 
decoder contains a buffer to store the received sequence, and a replica of the encoder. 
CHAPTER 2

TURBO CODES 


There exists a limiting value of E_b/N_0 below which error-free communication is
impossible at any information rate. This value of E_b/N_0 is called the Shannon limit. It
is not possible in practice to reach the Shannon limit, because doing so would cause the
bandwidth requirement and implementation complexity to increase without bound. Shannon's
work provided a theoretical proof for the existence of codes that can improve the BER
performance, or reduce the E_b/N_0 required. Our aim in coding and decoding is to get as
close to the Shannon limit as possible.

A low BER in a high noise environment requires very complex channel coding
and decoding schemes. According to Shannon's theorem, the performance of long random
codes can approach the Shannon limit. However, long random codes are in general
extremely difficult to decode.

Turbo coding, defined as the process of using parallel concatenation in
conjunction with recursive systematic convolutional codes (RSCC), can produce codes
with performance close to the Shannon limit. As mentioned above, the Shannon limit can be
approached when decoding large random codes; so in addition to a large minimum distance,
good codes should have a distance distribution that mimics that of random coding. Turbo
codes can be designed to generate a weight distribution similar to that of random codes.
This requires encoding both the information and an interleaved version of the information,
obtained through a pseudo-random interleaver. The input is presented as blocks of bits.

Turbo coding is regarded as an important new technology developed in recent
years because it finally brings error control coding techniques very close to the
Shannon limit. The performance of turbo codes is much better than that of all previously
designed block or convolutional coding techniques. Turbo codes are so efficient because
they combine several codes by concatenation, maximize the use of channel information,
and have a random-like distribution of codewords. This approach has significant error
correcting capacity even at very low E_b/N_0.

Though turbo coding is a newly invented error correcting technique, a large 
number of research papers have been published. Turbo coding techniques have 
progressed very rapidly and we can expect several commercial applications in the near 
future. Most of the research work has been on finding the exact explanation of the 
extraordinary performance of turbo codes and providing methods to obtain an even 
further improvement on the performance of turbo codes. In earlier research work, the 
outstanding performance of turbo codes was shown by computer simulations, and some 
theoretical explanations for the simulation results were discussed [1][13][17]. Then, some 
important components and parameters of the coding system became the main 
concentration. Researchers analyzed the theory about generator polynomial [20],
interleaver [22][23], puncturing pattern [3][24] and the decoding algorithm [14][15], and
tried to modify them to achieve even better performance. At the same time, factors such 
as system complexity, execution time, and cost were considered. In the most recent two 
or three years, output weight distribution has been found to decide the performance of 
turbo codes [25]. One area of investigation is the influence of the factors such as 
interleaver and generator polynomial on the output weight of the turbo codes. It is of 
interest to determine a way to achieve the best estimation of the output weight 
distribution when all the system parameters have been decided. An accurate estimation 
will be very helpful for the evaluation of the performance of the system. 

2.1 Concept of turbo codes 

2.1.1 Turbo encoding system 

Turbo codes are encoded by concatenating two RSCC's using an interleaver.
When a block of message bits is input to the system, the bits are encoded directly by one
of the two RSCC's, called RSCC1. The same block of message bits is interleaved by a
pseudo-random interleaver before being encoded by the other RSCC, called RSCC2. After
the parity sequences are generated by the RSCC's, they are punctured by puncturer 1 and
puncturer 2 to increase the code rate. The general encoding scheme of turbo codes is
shown in Fig 2.1.

Fig 2.1 General encoding scheme of turbo codes


2.1.1.1 Recursive systematic convolutional codes (RSCC)

RSCC's are constructed from NSCC's (non-systematic convolutional codes) by
using a feedback loop. They perform better than the best NSCC's at any SNR, especially
for high code rates. An RSCC generator is denoted as an a_b RSCC, where a and b are octal
numbers that are converted to binary to represent the connections in the generator circuit;
a is called the FB (feedback) connection and b the FF (feed-forward) connection.
Assume the generator matrix of a nonrecursive convolutional code has the form

G(D) = [g_1(D) g_2(D)] (2.1)

the equivalent generator matrix of the recursive systematic encoder is

G_R(D) = [1 g_2(D)/g_1(D)] (2.2)

where g_1(D) and g_2(D), respectively, represent the feedback and feedforward connections
of the RSC encoder. The impulse response of a well designed RSCC with memory M
repeats itself after 2^M - 1 bits [2].
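The periodic impulse response can be observed with a small RSC encoder sketch. The mapping from an octal number to tap coefficients (most significant bit taken as the undelayed tap) and the function names are assumptions of this sketch; the small 7_5 example with M = 2 is chosen because its response is easy to follow by hand.

```python
def octal_taps(octal_string):
    """Convert an octal generator such as '23' to a list of tap coefficients."""
    return [int(b) for b in bin(int(octal_string, 8))[2:]]


def rsc_parity(bits, fb_taps, ff_taps):
    """Parity output of an RSC encoder with transfer function g_2(D)/g_1(D).

    fb_taps (g_1) and ff_taps (g_2) are [c_0, c_1, ..., c_M] with c_0 = 1
    assumed for the feedback. The systematic output is the input itself.
    """
    M = len(fb_taps) - 1
    state = [0] * M                     # state[j] holds the feedback bit a_{k-1-j}
    parity = []
    for u in bits:
        a = u
        for j in range(M):              # feedback sum
            a ^= fb_taps[j + 1] & state[j]
        p = ff_taps[0] & a
        for j in range(M):              # feed-forward sum
            p ^= ff_taps[j + 1] & state[j]
        parity.append(p)
        state = [a] + state[:-1]
    return parity


impulse = [1] + [0] * 6
response_7_5 = rsc_parity(impulse, octal_taps("7"), octal_taps("5"))

long_impulse = [1] + [0] * 30
response_23_31 = rsc_parity(long_impulse, octal_taps("23"), octal_taps("31"))
```

For the 7_5 encoder (M = 2) the response after the first bit repeats with period 3 = 2^2 - 1. For the 23_31 encoder (M = 4) the response after the first bit repeats every 15 = 2^4 - 1 bits.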
2.1.1.2 Interleaver

Using a good interleaver is the most important factor in achieving the best possible
performance of turbo codes. Most input sequences, after going through the
RSCC's, have a random-like output weight distribution. However, there exist some input
sequences that cause low output weights. These low weight codewords cause the codes
to perform poorly. The use of an interleaver in the encoding of turbo codes helps to
reduce the number of low output weight codewords generated by a single RSCC. When
some of the input words produce low weight output codewords through RSCC1, the
interleaver makes most of them produce higher weight codewords through RSCC2.

The interleaver permutes the information bits into a different order to make the
output of RSCC2 (p_2) appear to be independent of the information sequence (u) and
therefore random-like, while still having a structure that permits decoding.
A random interleaver is preferred. A memory of size L = A x A is used to store the bits
to be interleaved. These bits are always written in through the rows of the memory,
then read out by a pseudo-random algorithm to implement the interleaving. The
correlations between these bits are changed in the process.
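A pseudo-random interleaver and its inverse can be sketched as follows. Python's seeded `random.Random` stands in for the pseudo-random read-out algorithm, which the text does not specify; the seed, the block size and the function names are assumptions of the sketch.

```python
import random


def make_interleaver(length, seed=0):
    """Return a pseudo-random permutation of the positions 0 .. length-1."""
    perm = list(range(length))
    random.Random(seed).shuffle(perm)   # seeded, so encoder and decoder agree
    return perm


def interleave(bits, perm):
    """Read the stored bits out in the permuted order."""
    return [bits[i] for i in perm]


def deinterleave(bits, perm):
    """Inverse operation: put every bit back in its original position."""
    out = [None] * len(bits)
    for pos, i in enumerate(perm):
        out[i] = bits[pos]
    return out


A = 4
block = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]   # L = A x A = 16 bits
perm = make_interleaver(A * A, seed=1)
shuffled = interleave(block, perm)
restored = deinterleave(shuffled, perm)
```

Because the permutation is generated from a fixed seed, the decoder can rebuild the same permutation and undo the interleaving exactly.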

The randomness of an interleaver in a turbo-code scheme can be tested by using 
computer simulations. Deinterleaving, the inverse function of interleaving, is 
implemented after the decoding. 

2.1.1.3 Puncturing pattern

Turbo coding is an important new technology that allows the operation of coded 
modulation schemes near channel capacity on power-limited channels. So, it can be used 
to offer near-capacity performance for deep space and satellite channels. However, it is 
desirable that the performance of turbo-coding schemes be also available for bandlimited 
channels. Trellis-coded M-PSK schemes have been proposed for bandwidth-efficient 
modulation and coding, but the carrier recovery faces the problem that the receiver is 
forced to operate below the recovery loop’s threshold. High rate turbo codes, which are 
both power and bandwidth efficient, may be the solution to this problem. 

Puncturing is used to achieve higher rate codes. Assume that the original
code has a rate of R_0. This means that for each information bit, 1/R_0 bits are
transmitted through the channel. Assuming also that the puncturing period is N_p and that
the puncturing pattern is the same in each period, we can construct a puncture matrix of
dimension (1/R_0) x N_p whose elements are either 1 or 0. A 1 indicates that the
corresponding bit is retained and a 0 indicates that the bit is punctured.

An example of how to achieve a high rate is as follows. Let R_0 = 1/2. We
puncture the code with period 4, and the 2 x 4 puncture matrix is defined as

    P = [ 1 1 0 1 ]
        [ 1 0 1 0 ]        (2.3)

The rate of the original code is 1/2 because for every 4 information bits, 8 bits are sent
through the channel. After puncturing, the rate of the code becomes 4/5,
because now only five bits are sent for the 4 information bits.
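The example can be reproduced with a short sketch. The function name and the symbolic labels u1..u4 and p1..p4 are illustrative; the first row of the pattern is applied to the first output stream of the rate 1/2 encoder and the second row to the second stream.

```python
from fractions import Fraction


def puncture(streams, pattern):
    """Keep stream[t] only where the pattern has a 1 in column t mod N_p."""
    period = len(pattern[0])
    out = []
    for t in range(len(streams[0])):
        for row, stream in zip(pattern, streams):
            if row[t % period]:
                out.append(stream[t])
    return out


pattern = [[1, 1, 0, 1],
           [1, 0, 1, 0]]                      # puncture matrix (2.3)
streams = [["u1", "u2", "u3", "u4"],          # first encoder output stream
           ["p1", "p2", "p3", "p4"]]          # second encoder output stream

sent = puncture(streams, pattern)
rate = Fraction(len(streams[0]), len(sent))
```

Five of the eight coded bits survive in each period of four information bits, so `rate` evaluates to 4/5.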

High rate turbo codes are obtained when we apply the concept of puncturing to
turbo codes. In turbo coding, the input data go to RSCC1 directly and go to
RSCC2 after interleaving. RSCC1 and RSCC2 can be identical or not. The systematic
information bit u_i is transmitted directly. RSCC1 and RSCC2 produce the parity bits,
denoted p_1i and p_2i as shown in Fig 2.1. The rates of RSCC1 and RSCC2 are both
1/2 when k parity bits are added to the k information bits and transmitted for each of
them. In this case, for the whole system, 2k parity bits are transmitted through the
channel together with the k information bits, so the rate of the system is 1/3. Any code
rate higher than 1/3 for turbo codes is called high rate. The code rate of the system can
be calculated from the code rates of the two RSCC's by the following equation.

1/R = 1/R_1 + 1/R_2 - 1 (2.4)

R_1 and R_2 can be different, but they should satisfy R_1 ≤ R_2 for best decoding
performance [3].
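Equation (2.4) can be checked with exact fractions. The helper name `turbo_rate` is hypothetical; the two test cases are the unpunctured system (both constituent rates 1/2) and the case in which both constituent encoders have rate 8/9.

```python
from fractions import Fraction


def turbo_rate(r1, r2):
    """Overall rate R from 1/R = 1/R_1 + 1/R_2 - 1.

    The systematic bits are shared by both constituent codes, so one copy
    of them is subtracted from the total number of transmitted bits.
    """
    return 1 / (1 / Fraction(r1) + 1 / Fraction(r2) - 1)


unpunctured = turbo_rate(Fraction(1, 2), Fraction(1, 2))
high_rate = turbo_rate(Fraction(8, 9), Fraction(8, 9))
```

The first case gives R = 1/3, in agreement with the count of k information bits and 2k parity bits, and the second gives R = 4/5.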

For turbo codes, in order to obtain good results from iterative decoding, only the
parity bits can be punctured. Thus, after puncturing, the code rate of each
RSC encoder lies between R_0 and N_p/(N_p+1). The rate N_p/(N_p+1) appears when only one
parity bit in each puncturing period is retained. Generally, high rate codes with a rate

R = N_p/(N_p+1), 2 ≤ N_p ≤ 16 (2.5)

are considered for constituent encoders with memory size M = 4. They achieve code
rates from 0.67 to 0.94. Similar rates are always selected for the two RSCC's, thus we have
R_1 = R_2. From (2.4), we can determine that only when

R_1 = R_2 = 2N_p/(2N_p+1) (2.6)

will R have the value N_p/(N_p+1). It follows that for each 2N_p information bits, only 2
parity bits, one from each of the two RSC encoders, will be transmitted after puncturing.
Thus there are (2N_p)^2 possible puncturing patterns to be considered in total.

Here we give an example to show how to achieve a high rate turbo code with a rate
of 4/5. Each of the two RSC encoders should have a rate of 8/9. Thus, we have R_0 = 1/2,
N_p = 4, 2N_p = 8. The following two puncturing matrices (2 x 8) are applied. P_A is for
RSCC1 and P_B is for RSCC2.

    P_A = [ 1 1 1 1 1 1 1 1 ]
          [ 1 0 0 0 0 0 0 0 ]        (2.7)

    P_B = [ 1 1 1 1 1 1 1 1 ]
          [ 0 0 0 0 0 1 0 0 ]        (2.8)

The notation P(c_1, c_2) can be used to indicate the puncturing pattern of turbo codes. If
the c_1-th bit in each period of 2N_p parity bits is retained for RSCC1, and the c_2-th bit
is retained for RSCC2, the pattern is described as P(c_1, c_2). So in our example, we have
P(1, 6). Figures 2.2 and 2.3 show the punctured turbo codes obtained with the puncturing
matrices (2.7) and (2.8). The generators used here are 23_31 for both encoders.


Fig 2.2 Punctured 23_31 RSCC1 with rate 4/5

Fig 2.3 Punctured 23_31 RSCC2 with rate 4/5

2.1.2 Turbo decoding system 
2.1.2.1 General turbo decoding scheme 

The decoding system of turbo codes is much more complicated than
the decoding system for convolutional codes. The general scheme for turbo decoding is
shown in Fig 2.4.

Fig 2.4 General decoding scheme of turbo codes

The system has two decoders. The inputs of the first soft output decoder are
the systematic information and the output of RSCC1, both with noise added. The output
of this decoder is an estimate of the information sequence and is called the reliability
value Λ. The input of the second soft output decoder is the interleaved new estimate Λ
together with the parity bits from RSCC2. The second decoder produces a new estimate of
the interleaved information bits.

The performance of the system can be improved by iteration, that is, by adding a
feedback path from the output of the second decoder to the input of the first decoder.
Thus the first decoder can use all the information available instead of only using the
systematic information and the output from the first RSCC. The feedback information
should be independent of the information generated by the first decoder; otherwise it will
cause positive feedback and the decoding could be unstable.

2.1.2.2 MAP algorithm 

Soft output decoding is applied for turbo codes to improve the performance, since with
this method all the information from the channel can be used without any loss. Several
algorithms are used to implement soft decisions. Among them, the MAP algorithm
(Maximum a Posteriori Probability algorithm) is the optimal one for decoding. Other
widely used algorithms include Max-log-MAP (a simplification of the MAP algorithm)
and SOVA (Soft Output Viterbi Algorithm).

The MAP algorithm gives both the decision for every bit and the reliability value for
the bit. This optimal method minimizes the probability of bit error. A value defined as

$$\Lambda(u_i) = \ln \frac{P(u_i = 1)}{P(u_i = 0)} \qquad (2.9)$$

is used to determine a soft output value, where $P$ denotes the probability of $u_i$ being equal
to 1 or 0. The sign of $\Lambda(u_i)$ determines whether the bit is a 0 or a 1, while the magnitude
determines the reliability of the decoded bit. The natural logarithm is always used. For the
derivation of the MAP algorithm, we use the notations below:

$r_{i_1}^{i_2}$ : received sequence from time $i_1$ to time $i_2$
$r_0^{l}$ : the entire received sequence (corrupted by noise)
$r_i$ : the received information at time unit $i$, $r_i = (x_i, y_i)$
$x_i$ : information bit at time unit $i$
$y_i$ : parity bit at time unit $i$
$S_i$ : the state of the encoder at time unit $i$
$s$ : value of $S_i$
$s'$ : value of $S_{i-1}$
$s, s' = 0, 1, \ldots, M_s - 1$, where $M_s$ is the total number of states

The MAP algorithm gives the decision and the reliability value for any bit given that
all bits have been received, so we have

$$P(u_i = a) = \sum_{(s', s) \to u_i = a} P\{S_{i-1} = s'; S_i = s \mid r_0^{l}\} \qquad (2.10)$$

In this equation, $u_i$ is the input information bit at time unit $i$. The value of $u_i$ equals $a$,
which is either 1 or 0. $(s', s) \to u_i = a$ denotes the possible state transitions at time unit $i$
when the input bit is $a$. Equivalently, based on Bayes' rule we have

$$P\{S_{i-1} = s'; S_i = s \mid r_0^{l}\} = \frac{P\{S_{i-1} = s'; S_i = s; r_0^{l}\}}{P\{r_0^{l}\}} \qquad (2.11)$$

Define $\sigma$ as

$$\sigma_i(s', s) = P\{S_{i-1} = s'; S_i = s; r_0^{l}\} \qquad (2.12)$$

Then, we have

$$\Lambda(u_i) = \ln \frac{\sum_{(s', s) \to u_i = 1} \sigma_i(s', s)}{\sum_{(s', s) \to u_i = 0} \sigma_i(s', s)} \qquad (2.13)$$

In the MAP algorithm, the probability of a state transition is split into three portions,

$$\sigma_i(s', s) = \alpha_{i-1}(s') \times \gamma_i(s', s) \times \beta_i(s) \qquad (2.14)$$

where $\alpha_{i-1}(s')$ represents the portion developed from the received information prior to
the time of the state transition, $\beta_i(s)$ represents the portion developed from the
received information after the state transition, and $\gamma_i(s', s)$ represents the portion based on the
received information at the time of the state transition. We have

$$\alpha_{i-1}(s') = P\{S_{i-1} = s'; r_0^{i-1}\} \qquad (2.15)$$

$$\beta_i(s) = P\{r_{i+1}^{l} \mid S_i = s\} \qquad (2.16)$$

$$\gamma_i(s', s) = P\{S_i = s; r_i \mid S_{i-1} = s'\} \qquad (2.17)$$
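Next we prove Equation (2.14). Based on the Markov property, we know that if the state at
time $i$, $S_i$, is known, events after time $i$ do not depend on $r_0^{i}$.

$$\begin{aligned}
&\alpha_{i-1}(s') \times \gamma_i(s', s) \times \beta_i(s) \\
&= P\{S_{i-1} = s'; r_0^{i-1}\} \times P\{S_i = s; r_i \mid S_{i-1} = s'\} \times P\{r_{i+1}^{l} \mid S_i = s\} \\
&= P\{S_{i-1} = s'; r_0^{i-1}\} \times P\{S_i = s; r_i \mid S_{i-1} = s'; r_0^{i-1}\} \times P\{r_{i+1}^{l} \mid S_i = s\} \\
&= P\{S_{i-1} = s'; S_i = s; r_0^{i}\} \times P\{r_{i+1}^{l} \mid S_i = s\} \\
&= P\{S_{i-1} = s'; S_i = s; r_0^{i}\} \times P\{r_{i+1}^{l} \mid S_{i-1} = s'; S_i = s; r_0^{i}\} \\
&= P\{S_{i-1} = s'; S_i = s; r_0^{l}\} \\
&= \sigma_i(s', s)
\end{aligned}$$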


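Now, to obtain the reliability value, we need to calculate $\alpha_i(s)$, $\beta_i(s)$ and $\gamma_i(s', s)$.
$\alpha_i(s)$ and $\beta_i(s)$ can be calculated recursively.

$$\alpha_i(s) = \sum_{s'} \alpha_{i-1}(s') \times \gamma_i(s', s)$$

$$\beta_{i-1}(s') = \sum_{s} \beta_i(s) \times \gamma_i(s', s) \qquad (2.18)$$

We prove Equations (2.18) here.

$$\begin{aligned}
\sum_{s'} \alpha_{i-1}(s') \times \gamma_i(s', s)
&= \sum_{s'} P\{S_{i-1} = s'; r_0^{i-1}\} \times P\{S_i = s; r_i \mid S_{i-1} = s'\} \\
&= \sum_{s'} P\{S_{i-1} = s'; r_0^{i-1}\} \times P\{S_i = s; r_i \mid S_{i-1} = s'; r_0^{i-1}\} \\
&= \sum_{s'} P\{S_{i-1} = s'; S_i = s; r_0^{i}\} \\
&= P\{S_i = s; r_0^{i}\} = \alpha_i(s)
\end{aligned}$$

$$\begin{aligned}
\sum_{s} \beta_i(s) \times \gamma_i(s', s)
&= \sum_{s} P\{r_{i+1}^{l} \mid S_i = s\} \times P\{S_i = s; r_i \mid S_{i-1} = s'\} \\
&= \sum_{s} P\{r_{i+1}^{l} \mid S_i = s; r_i; S_{i-1} = s'\} \times \frac{P\{S_i = s; r_i; S_{i-1} = s'\}}{P\{S_{i-1} = s'\}} \\
&= P\{r_i^{l} \mid S_{i-1} = s'\} = \beta_{i-1}(s')
\end{aligned}$$

We add a superscript to $\sigma_i(s', s)$ and $\gamma_i(s', s)$ to show the information bit at time $i$
(0 or 1). The modified equations are shown below.

$$\Lambda(u_i) = \ln \frac{\sum_{s} \sum_{s'} \sigma_i^{1}(s', s)}{\sum_{s} \sum_{s'} \sigma_i^{0}(s', s)} \qquad (2.19)$$

$$\sigma_i^{a}(s', s) = \alpha_{i-1}(s') \times \gamma_i^{a}(s', s) \times \beta_i(s) \qquad (2.20)$$

$$\alpha_i(s) = \sum_{a} \sum_{s'} \alpha_{i-1}(s') \times \gamma_i^{a}(s', s)$$

$$\beta_{i-1}(s') = \sum_{a} \sum_{s} \beta_i(s) \times \gamma_i^{a}(s', s)$$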

The only unknown is $\gamma_i^{a}(s', s)$. We have

$$\begin{aligned}
\gamma_i^{a}(s', s) &= P\{S_i = s; r_i \mid S_{i-1} = s'\} \\
&= \frac{P\{S_i = s; r_i; S_{i-1} = s'\}}{P\{S_{i-1} = s'\}} \\
&= \frac{P\{S_i = s; r_i; S_{i-1} = s'\}}{P\{S_i = s; S_{i-1} = s'\}} \times \frac{P\{S_i = s; S_{i-1} = s'\}}{P\{S_{i-1} = s'\}} \\
&= P\{r_i \mid S_i = s; S_{i-1} = s'\} \times P\{S_i = s \mid S_{i-1} = s'\} \\
&= P_x \times P_y
\end{aligned} \qquad (2.21)$$

Here $P_y$ is a constant, since once $S_{i-1} = s'$ is known, the probability of $S_i = s$ has
been decided. Then we only need to obtain $P_x$. The $r_i$ is made up of $x_i$ and $y_i$, where $x_i$
represents the $i$-th information bit and $y_i$ represents the parity bit. Assume that the signals
go through an AWGN channel with noise variance $N_0/2$ and that BPSK modulation is
implemented. We thus actually transmit $+1$ for $u_i = 1$ and $-1$ for $u_i = 0$, so that

$$x_i = (2u_i - 1) + \text{noise}$$
$$y_i = (2p_i - 1) + \text{noise}$$


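And,

$$P_x = P\{x_i \mid S_i = s, S_{i-1} = s'\} \times P\{y_i \mid S_i = s, S_{i-1} = s'\}
= \exp\left[-\frac{(x_i - (2u_i - 1))^2}{N_0}\right] \times \exp\left[-\frac{(y_i - (2p_i - 1))^2}{N_0}\right] \qquad (2.23)$$

Thus,

$$\gamma_i^{a}(s', s) = \text{const} \times \exp\left[-\frac{(x_i - (2u_i - 1))^2}{N_0}\right] \times \exp\left[-\frac{(y_i - (2p_i - 1))^2}{N_0}\right] \qquad (2.24)$$

Up to now, $\alpha_i(s)$, $\beta_i(s)$ and $\gamma_i(s', s)$ at any time unit $i$ can be obtained. We can get
the reliability value based on them.

$$\Lambda(u_i) = \ln \frac{\sum_{s} \sum_{s'} \alpha_{i-1}(s') \cdot \gamma_i^{1}(s', s) \cdot \beta_i(s)}{\sum_{s} \sum_{s'} \alpha_{i-1}(s') \cdot \gamma_i^{0}(s', s) \cdot \beta_i(s)} \qquad (2.25)$$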

We give an example here to show the steps of the calculations. Assume we have a
memory size 2 recursive convolutional encoder as shown in Fig 2.5.

The input-output state diagram of this (5, 7) RSCC is shown in Fig 2.6.

To implement the MAP algorithm, the trellis diagram is more useful. The trellis
diagram of the encoder is shown in Fig 2.7.

Fig 2.7 Trellis diagram of (5, 7) RSCC

Assume that ten random information bits are encoded. The information
sequence is [0010001000], and the 10 parity bits generated by the encoder are
[0011100011]. After going through the AWGN channel with noise variance 1.6 and
puncturing half of the parity bits to achieve a rate 1/2 code (bits deleted by puncturing are
inserted as zeros), the received sequences are as follows.

Information bits $x_i$:

[-1.04 -1.14 1.73 -1.48 -0.02 -1.49 -0.53 -1.71 -1.94 -2.73]

Parity bits $y_i$:

[-0.70 0 -0.23 0 1.78 0 -0.59 0 1.53 0]
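As a check on the parity sequence quoted above, the following Python sketch encodes the same ten information bits. It assumes that the (5, 7) RSCC uses the feedback polynomial $1 + D + D^2$ (7) and the feed-forward polynomial $1 + D^2$ (5), that the encoder starts in the all zero state, and that no termination is applied; the function names are our own.

```python
def rsc_encode(info_bits):
    """Parity bits of the memory-2 (5, 7) RSCC: feedback 1+D+D^2, feed-forward 1+D^2.

    The systematic output is the input itself, so only the parity is returned.
    """
    s1 = s2 = 0                     # s1: most recent register, s2: oldest register
    parity = []
    for u in info_bits:
        a = u ^ s1 ^ s2             # feedback bit entering the shift register
        parity.append(a ^ s2)       # feed-forward taps 1 and D^2
        s1, s2 = a, s1              # shift the register
    return parity


def bpsk(bits):
    """Map bit 0 to -1 and bit 1 to +1, the noise-free part of x_i = (2 u_i - 1) + noise."""
    return [2 * b - 1 for b in bits]


info = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
parity = rsc_encode(info)
print(parity)          # [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
print(bpsk(info))
```

Under these assumptions the sketch reproduces the parity sequence [0011100011] of the example.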

The five main steps in the decoding procedure are as follows:

1. Calculate all $\gamma_i^{a}(s', s)$
2. Calculate $\alpha_i(s)$ for all states and times from $\alpha_0(s)$ to $\alpha_{10}(s)$
3. Calculate $\beta_i(s)$ for all states and times from $\beta_{10}(s)$ to $\beta_0(s)$
4. Calculate all $\sigma_i^{a}(s', s)$ in one time unit
5. Calculate $\Lambda(u_i)$

Step 1: Calculation of $\gamma_i^{a}(s', s)$

Assume the constant in equation (2.24) is 1. We show the example for
calculating $\gamma_1^{0}(0, 0)$ and $\gamma_1^{1}(0, 2)$ here. We have $x_1 = -1.04$, $y_1 = -0.70$. For $\gamma_1^{0}(0, 0)$,
the transmitted symbols are $u_1 = -1$, $p_1 = -1$ (transition noted with 0/0), so

$$\gamma_1^{0}(0, 0) = \exp\left[-\frac{(-1.04 - (-1))^2}{N_0}\right] \times \exp\left[-\frac{(-0.70 - (-1))^2}{N_0}\right] = 0.98$$

For $\gamma_1^{1}(0, 2)$, $u_1 = 1$, $p_1 = 1$ (transition noted with 1/1), we have

$$\gamma_1^{1}(0, 2) = \exp\left[-\frac{(-1.04 - 1)^2}{N_0}\right] \times \exp\left[-\frac{(-0.70 - 1)^2}{N_0}\right] = 0.02$$

The rest of the $\gamma_i^{a}(s', s)$ can be calculated in the same way.


Step 2: Calculation of $\alpha_i(s)$

We assume that the encoder starts in state 0, so $\alpha_0(0) = 1$ and $\alpha_0(s) = 0$ for all $s \neq 0$.
All $\alpha_i(s)$ can be obtained by recursive calculation based on the previous $\alpha_{i-1}(s')$ and
$\gamma_i(s', s)$. We show the example for calculating $\alpha_1(0)$ and $\alpha_1(2)$ here. Since there is no
transition from state 0 to state 1 or state 3 at time 1, $\alpha_1(1) = 0$ and $\alpha_1(3) = 0$. And

$\alpha_1(0) = \alpha_0(0) \times \gamma_1^{0}(0, 0) = 1 \times 0.98 = 0.98$
$\alpha_1(2) = \alpha_0(0) \times \gamma_1^{1}(0, 2) = 1 \times 0.02 = 0.02$

Similarly, all $\alpha_i(s)$ can be computed and are as follows:

i =        0     1     2     3     4     5     6     7     8     9     10
State 3:   0     0     .02   .03   .80   .12   .03   .33   .18   .63   .02
State 2:   0     .02   .07   .83   .11   .03   .77   .18   .36   .02   .33
State 1:   0     0     .00   .11   .06   .79   .12   .36   .32   .33   .63
State 0:   1     .98   .91   .03   .03   .06   .08   .13   .14   .02   .02


Step 3: Calculation of $\beta_i(s)$

The calculation of $\beta_i(s)$ is implemented as a backward recursion. The final state of
the encoder is not known. However, there are two methods that can be used for the
initialization of $\beta_{10}(s)$: $\beta_{10}(s)$ is either initialized as $\alpha_{10}(s)$ or with equal weighting as $1/2^M$.
We use the first method for the initialization and show the example to calculate $\beta_9(0)$
here.

We already know that $\beta_{10}(0) = \alpha_{10}(0) = 0.02$, $\beta_{10}(2) = \alpha_{10}(2) = 0.33$, $\gamma_{10}^{0}(0, 0) = 0.18$,
$\gamma_{10}^{1}(0, 2) = 0.0055$; then

$\beta_9(0) = \beta_{10}(0) \gamma_{10}^{0}(0, 0) + \beta_{10}(2) \gamma_{10}^{1}(0, 2) = 0.038$

Here we should take care of one more thing. In the previous calculation of $\gamma(s', s)$,
no normalization of the probability distribution has been done. So we must normalize
so that the sum of the $\beta_i(s)$ at any time unit $i$ is 1.

After normalization, all $\beta_i(s)$ are listed below:

i =        0     1     2     3     4     5     6     7     8     9     10
State 3:   .23   .17   .10   .07   .26   .44   .23   .01   .34   .63   .02
State 2:   .16   .10   .58   .26   .44   .24   .27   .33   .66   .03   .33
State 1:   .20   .51   .18   .43   .07   .26   .44   .64   .00   .33   .63
State 0:   .41   .22   .19   .24   .23   .06   .06   .02   .00   .19   .02


Step 4: Calculation of $\sigma_i^{a}(s', s)$

After all the $\alpha_i(s)$, $\beta_i(s)$ and $\gamma_i^{a}(s', s)$ have been obtained, $\sigma_i^{a}(s', s)$ can be
calculated. An example of calculating $\sigma_1^{1}(0, 2)$ and $\sigma_1^{0}(0, 0)$ is shown here:

$$\sigma_1^{0}(0, 0) = \alpha_0(0) \times \gamma_1^{0}(0, 0) \times \beta_1(0)$$
$$\sigma_1^{1}(0, 2) = \alpha_0(0) \times \gamma_1^{1}(0, 2) \times \beta_1(2)$$

Step 5: Obtain $\Lambda(u_i)$

Then the reliability value of the first information bit is

$$\Lambda(u_1) = \ln \frac{\sum_{s} \sum_{s'} \sigma_1^{1}(s', s)}{\sum_{s} \sum_{s'} \sigma_1^{0}(s', s)} = \ln \frac{\sigma_1^{1}(0, 2)}{\sigma_1^{0}(0, 0)}$$

By using the same method, the reliability values obtained for the ten information bits
are [ -4.67 -2.2 5.13 -6.15 3.77 -6.15 1.66 -6.0 -7.1 -7.8]. From these
reliability values, we get the complete decoded sequence as

[ 0 0 1 0 0 0 1 0 0 0 ]

which is identical to the original information sequence.
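The five steps of this example can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the program used for the simulations: states are numbered as $2 s_1 + s_2$ (so that input 1 leads from state 0 to state 2), the constant in (2.24) is set to 1, $\alpha$ and $\beta$ are normalized at every time unit, and $\beta$ at the last time unit is initialized with equal weights (the second initialization method). The function names and the value $N_0 = 1$ used in the demonstration are hypothetical.

```python
import math


def build_trellis():
    """Transitions of the (5, 7) RSCC as (s_prev, bit, parity, s_next); state = 2*s1 + s2."""
    trans = []
    for s_prev in range(4):
        s1, s2 = s_prev >> 1, s_prev & 1
        for u in (0, 1):
            a = u ^ s1 ^ s2                      # feedback 1 + D + D^2
            trans.append((s_prev, u, a ^ s2, 2 * a + s1))
    return trans


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def map_decode(x, y, n0):
    """Return the reliability values Lambda(u_i) for received systematic x and parity y."""
    trans, n = build_trellis(), len(x)

    def gamma(i, u, p):
        # Step 1: branch metric of (2.24) with const = 1. A punctured parity bit
        # received as 0 gives the same factor for p = 0 and p = 1.
        return (math.exp(-(x[i] - (2 * u - 1)) ** 2 / n0)
                * math.exp(-(y[i] - (2 * p - 1)) ** 2 / n0))

    # Step 2: forward recursion, encoder assumed to start in state 0.
    alpha = [[1.0, 0.0, 0.0, 0.0]]
    for i in range(n):
        nxt = [0.0] * 4
        for sp, u, p, s in trans:
            nxt[s] += alpha[i][sp] * gamma(i, u, p)
        alpha.append(normalize(nxt))

    # Step 3: backward recursion, equal weights at the end (final state unknown).
    beta = [None] * n + [[0.25] * 4]
    for i in range(n - 1, -1, -1):
        prev = [0.0] * 4
        for sp, u, p, s in trans:
            prev[sp] += beta[i + 1][s] * gamma(i, u, p)
        beta[i] = normalize(prev)

    # Steps 4 and 5: sigma for every transition, then the log ratio of (2.25).
    llr = []
    for i in range(n):
        num = den = 0.0
        for sp, u, p, s in trans:
            sigma = alpha[i][sp] * gamma(i, u, p) * beta[i + 1][s]
            if u == 1:
                num += sigma
            else:
                den += sigma
        llr.append(math.log(num / den))
    return llr


# Noise-free demonstration with the information and parity bits of the example.
info = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
parity = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
llr = map_decode([2 * b - 1 for b in info], [2 * b - 1 for b in parity], n0=1.0)
decoded = [1 if value > 0 else 0 for value in llr]
print(decoded)
```

The normalization of $\alpha$ and $\beta$ scales the numerator and denominator of (2.25) by the same factor, so it does not change $\Lambda(u_i)$.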

Though the MAP algorithm is the optimal decoding algorithm, it has some obvious
disadvantages. A very large amount of memory is needed for decoding, since all the
$\alpha_i(s)$ for every state and every time must be stored before $\beta_i(s)$ can be calculated. The
computational complexity is very high, since a large number of multiplications and
additions must be carried out.

We have mentioned that, besides the MAP algorithm, SOVA and log-MAP
algorithms can also be implemented. SOVA compares metric values at each node of the
trellis to decide which path is the maximum likelihood path, similar to the standard Viterbi
algorithm. However, for each node, SOVA also compares the maximum likelihood path
with the second best path to update a reliability value. This method requires only
comparisons of metrics and table lookups and needs only one pass through the
information, while the MAP algorithm requires both forward ($\alpha_i(s)$) and backward ($\beta_i(s)$)
passes. So it is less time consuming than the MAP algorithm. The log-MAP algorithm is a
simplification of the MAP algorithm. It takes the log of the transition probabilities
$\gamma_i^{a}(s', s)$ and replaces them by approximations. The log-MAP algorithm is a better
approximation than SOVA, and there is only a small degradation in the performance of
log-MAP compared to the MAP algorithm.
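To illustrate how working in the log domain replaces the sums of products in the $\alpha$ and $\beta$ recursions, the sketch below shows the Jacobian logarithm, $\ln(e^a + e^b) = \max(a, b) + \ln(1 + e^{-|a-b|})$, and the version with the correction term dropped, which is the usual Max-log-MAP simplification. The function names are our own and the numbers are only illustrative.

```python
import math


def max_star(a, b):
    """Exact ln(e^a + e^b), written as a maximum plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a, b):
    """Approximation with the correction term dropped (Max-log-MAP style)."""
    return max(a, b)


a, b = math.log(0.98), math.log(0.02)
print(max_star(a, b))   # ln(0.98 + 0.02), approximately 0
print(max_log(a, b))    # ln(0.98)
```

The correction term is at most $\ln 2$ and shrinks quickly as $|a - b|$ grows, which is why it is often stored in a small lookup table.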

2.2 Performance of turbo codes 

For a bit error rate lower than $10^{-5}$, uncoded binary modulation (BPSK)
requires $E_b/N_0$ to be larger than 9.6 dB. From our simulation results, we found that
for a rate 1/3 turbo code, the $E_b/N_0$ needed to reach a bit error rate of $10^{-5}$ can be
reduced to 0.1 dB. The performance improvement of turbo codes can therefore be as large
as 9.5 dB.

The soft-decision decoding performance bounds at different code rates are given
in Proakis' book, Digital Communications [Fig. 5.2.14]. This plot shows the smallest
$E_b/N_0$ values required to achieve a BER of $10^{-5}$ with BPSK modulation. At a rate equal to
zero, which means that infinitely many parity bits are transmitted together with the
information bits, the bound is approximately -1.6 dB, which is equal to the Shannon limit.
At a rate equal to 1, which means that no parity bits are transmitted and is equivalent to
uncoded transmission, the bound is 9.6 dB and matches the performance of
uncoded transmission. Fig 2.8 shows the bounds.
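The two end points quoted above can be checked numerically. The sketch below assumes the standard expressions for them: the Shannon limit as the rate tends to zero is $E_b/N_0 = \ln 2$, and the bit error rate of uncoded BPSK on an AWGN channel is $Q(\sqrt{2E_b/N_0})$. The function names are our own.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def uncoded_bpsk_ber(ebno_db):
    """Bit error rate of uncoded BPSK on an AWGN channel: Q(sqrt(2 Eb/N0))."""
    ebno = 10.0 ** (ebno_db / 10.0)
    return q_function(math.sqrt(2.0 * ebno))


shannon_limit_db = 10.0 * math.log10(math.log(2.0))   # limit as the code rate tends to 0
print(round(shannon_limit_db, 2))      # -1.59
print(uncoded_bpsk_ber(9.6))           # close to 1e-5
```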

Fig 2.8 Soft-decoding bounds at different code rates (code rate $R_c$ from 0.2 to 1.0
versus $E_b/N_0$ from -2 to 6 dB)

In our research, the performance of high rate turbo codes is compared with the bounds
at a BER of $10^{-5}$. Our simulations are done for rates 1/2, 2/3, 3/4, 4/5, 5/6,
10/11, 15/16, and 16/17 with the best selection of parameters. A two-encoder parallel
concatenation system with memory size 4 is implemented. We select the generator
polynomial 23_31, since it is the best choice, and implement a pseudo-random
interleaver with size 256x256. The selection of puncturing patterns follows the
recent paper "High Rate Turbo Codes for BPSK/QPSK Channels" [3] and our own research.
In Table 2.1, the puncturing patterns selected for the different code rates are listed. For
rates 5/6, 10/11 and 15/16, modified puncturing patterns are applied to achieve the best
performance (which we will discuss later). The MAP algorithm is applied in decoding since
it is the optimal soft decoding algorithm. The number of iterations in the decoding process
is set to 18. In each set of simulations, the total number of information bits is $10^{7}$. That is,
the BER value (bit error rate) we can test is down to the $10^{-6}$ level.

Fig 2.9 shows the performance after 18 iterations for different code rates. The
$E_b/N_0$ values at which a BER of $10^{-5}$ is achieved by the different rate codes are also listed
in Table 2.1, and they are compared with the Shannon limit. From the results, we see that
the performance of turbo codes at all these rates is within 0.5 dB of the Shannon limit.


Table 2.1 Performance of high rate turbo codes

Rate                  1/2     2/3     3/4     4/5     5/6       10/11     15/16     16/17
Puncturing pattern    P(1,1)  P(3,4)  P(3,5)  P(7,6)  modified  modified  modified  P(2,2)
Eb/N0 (dB)            0.75    1.55    2.1     2.5     2.8       3.7       4.2       [illegible]
Distance (dB)         0.45    0.5     0.5     0.5     0.5       0.45      0.4       0.4



Fig 2.9 The performance of turbo codes at different code rates

Fig 2.10 shows the performance of turbo codes compared to some block and
convolutional coding schemes and the performance bounds.

Fig 2.10 The performance of turbo codes compared to some
convolutional and block turbo codes, and the Shannon limit

In our simulations, we studied the effects of system components and parameters
on the code performance. All components and parameters which may affect the
performance are listed in Table 2.2. In the next section, we will discuss the dominant factor
for turbo code performance.


Table 2.2 Factors for the performance of turbo codes

Encoding system
    Structure:               Parallel concatenation / serial concatenation
                             Levels of concatenation
    Convolutional encoder:   Memory size (constraint length)
                             Systematic / nonsystematic
                             Recursive / non-recursive
                             Generator polynomial
    Interleaving:            Nonrandom / random
                             Algorithm used for random interleaving
                             Size
    Puncturing:              Code rate
                             Puncturing pattern
Decoding system
    Algorithm
    Soft / hard
    Iterations

2.3 Output weight distribution and performance bounds of turbo codes
2.3.1 Output weight distribution

The output weight distribution is a more recent concern of turbo code researchers. The
relation between turbo code performance and the output weight distribution has been
studied extensively. At first, the performance of turbo codes was claimed to be decided
mainly by the lowest weight code words (whose weight equals the free distance of the code)
together with the effective multiplicity of these free distance code words [18]. Later it was
argued that the whole output weight spectrum should be considered to
estimate the code performance [25].

Linear recursive systematic codes are used as the component codes of turbo
codes. The minimum output weight of the codes is equal to the free distance of the code.
For punctured high rate turbo codes, the minimum output weight decreases, but it should
be proportional to the free distance of the codewords that are generated by the component
RSCC. Since better error detection and correction capability can be expected when the
free distance of the codewords is larger, the minimum output weight can be seen as the
dominant factor of the code performance.

A low weight output sequence is always generated by a low weight input
sequence. The output sequence is made up of the input information sequence and the
generated parity sequence. So when the information sequence itself has high weight, the
output sequence will definitely have high weight too. Also, a high weight information
sequence has very little chance of generating a low weight parity sequence.

We know that the interleaver in the encoding system makes the probability that all
encoders generate low weight output simultaneously very low. We can show that the
higher the weight of the input sequence, the lower this probability will be. For the case of
a perfect random interleaver with size $L = A \times A$, we assume a weight-$w$ information
sequence. The information sequence has a nonzero-bit distribution which causes a low
weight output from the first RSCC. The probability that the interleaved information
sequence, as the input to the second RSCC, also has a nonzero-bit distribution that
causes low weight output can be approximately represented by equation (2.26).

This probability is obtained approximately when we assume that the interleaver size is
large enough for the block edge effects to be negligible.

For example, assume an input sequence [1 0 0 1 0 0 0 0 ...] with weight 2, which
can cause low weight output in the first RSCC. After interleaving, the probability that
the interleaved sequence also has 2 zeros between its two nonzero bits is
roughly $2/L$. We now explain how this $2/L$ is calculated. The interleaver changes the
order of the bits in the information sequence. For the weight 2 information
sequence, after the location of the first nonzero bit has been decided, there are $L-1$
locations left where the second nonzero bit can be placed. Among these $L-1$ locations, two,
namely the location 3 bits ahead of the first nonzero bit and the location 3 bits after it,
cause low weight output. So the probability of low weight output is $2/(L-1)$, which is
approximately equal to $2/L$ when $L$ is large. Similarly, if we assume a weight 3 input
sequence, the probability that the interleaved sequence causes low weight output is
approximately $4/L^2$.
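The weight 2 argument above can be checked by exact counting. The sketch below assumes a uniformly random interleaver, so that the two nonzero bits land on a uniformly chosen ordered pair of distinct positions, and it counts the block edges exactly instead of neglecting them. The function name is our own.

```python
from fractions import Fraction


def prob_spacing_preserved(length, spacing=3):
    """Probability that the two nonzero bits end up exactly `spacing` positions apart."""
    favourable = 2 * (length - spacing)      # ordered position pairs at that distance
    total = length * (length - 1)            # all ordered pairs of distinct positions
    return Fraction(favourable, total)


L = 256 * 256
exact = prob_spacing_preserved(L)
print(float(exact), 2 / L)
```

For the interleaver size 256x256 used in our simulations, the exact value and the approximation $2/L$ differ by less than 0.01 percent, which supports neglecting the edge effects.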

From equation (2.26), we can see that the probability that a weight $w+1$ input
sequence causes low weight output is only about $1/L$ of that of a weight $w$ sequence. Thus
we can draw the conclusion that the free distance code word is most likely to be
generated by the minimum weight information sequence. Some researchers assume that
the weight 2 information sequence dominates the performance of turbo codes,
because weight 2 is the smallest weight that can cause low weight output (a weight 1 input
sequence will never cause a low weight output). We have doubts about this assumption.
In practice, for an information sequence of length $L$, it is appropriate to
assume that each bit in the sequence is 0 or 1 with probability one half.
So the probability that a weight 2 input sequence occurs, especially when $L$ is
large, should be extremely small. We feel the proper assumption is just that the
minimum weight input sequence generates the minimum weight output sequence.

Also, with the above assumption, we can expect the weight distribution of the
information sequences to be approximately Gaussian, and we can expect the output
weight distribution spectrum to have a similar shape. That means a large percentage of
the information sequences have a weight near the middle, and only a small proportion
are low weight or high weight sequences. So although the low weight information
sequences have a comparatively much larger influence on the code performance, their
multiplicity is much lower than that of the middle-weight information sequences. For an
accurate evaluation of the code performance, considering only the influence of low weight
outputs is therefore not sufficient. That is why the output weight distribution spectrum
should be taken into consideration for a better estimation of the code performance.
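The rarity of weight 2 information sequences can be made concrete. Under the assumption stated above (independent bits, each 0 or 1 with probability one half), the weight of a length-$L$ sequence follows a binomial distribution with mean $L/2$ and variance $L/4$, which is approximately Gaussian for large $L$. The sketch below evaluates it for a short, hypothetical block length; the function name is our own.

```python
import math


def weight_probability(length, weight):
    """P(weight) for a sequence of `length` independent equiprobable bits."""
    return math.comb(length, weight) / 2 ** length


L = 64                                   # hypothetical short block, for illustration only
print(weight_probability(L, 2))          # weight 2 sequences are extremely rare
print(weight_probability(L, L // 2))     # the most likely weight is L/2
mean, variance = L / 2, L / 4            # binomial moments behind the Gaussian shape
```

Even for $L = 64$ the probability of a weight 2 sequence is far below $10^{-15}$, and it decreases further as $L$ grows.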

To achieve an improvement in the turbo code performance, we want the minimum 
output weight to be higher and the multiplicity of the low weight codewords to be 
smaller. This aim can only be achieved when the variance of the distribution spectrum is 
decreased. It follows that when we try to improve the turbo code performance, all we 
need to do is try to decrease the variance of the output weight distribution. 

2.3.2 Performance bounds

Since turbo codes are generated by the parallel concatenation of two or more
recursive systematic convolutional encoders, we can obtain the performance bounds of
turbo codes from the analysis of the bounds of systematic convolutional codes. For a block
of information bits with length equal to $L$, there are in total $2^L$ possible
code words. Since the sum of the probabilities of individual events is no less
than the probability of the union of the events, we can state that $P_b$, the error probability
of the convolutional code, is no greater than the sum of the error probabilities of
each of the $2^L$ possible code words,

$$P_b \le \sum_{i=1}^{2^L} P_{c_i} \qquad (2.27)$$

where $P_{c_i}$ represents the error probability of each of the possible code words.

Assume that the signal energy per information bit is $E_b$; then the received signal
energy per code word (information + parity) bit is $RE_b$, where $R$ is the code rate. If
BPSK modulation is used, '+1' and '-1' are transmitted. Also assume an
additive white Gaussian noise (AWGN) channel. The received signal then has a Gaussian
distribution with mean $\pm\sqrt{RE_b}$ and variance $N_0/2$. It is well known that the error
probability of each code word is given by

(2.28)

in which

$P_{c_i}$: error probability of each one of the $2^L$ possible code words
$E_s$: signal energy per code word bit
$E_n$: average energy of the Gaussian noise
$\omega_i$: weight of the information bits of a certain code word
$d$: Hamming weight of the codeword
$E_b/N_0$: signal to noise ratio
$R$: code rate, the ratio of the number of information bits to the codeword length
$L$: the size of the block of information bits, or information sequence length
$Q(x)$: the Gaussian tail function. The Q-function is defined as the
integral of the zero-mean, unit-variance Gaussian density function from a point
$x$ to infinity. Here, the Q-function gives the probability of error when the
total Hamming weight of a code word is $d$.


Combining equations (2.27) and (2.28), the BER performance of a finite length
convolutional code with maximum-likelihood decoding (MLD) on an AWGN channel
can be upper bounded by using the union bound

(2.29)

To make the calculation of the bounds more convenient, we make a small
modification to collect the codewords of the same $d$.

$$P_b \le \sum_{d = d_{free}}^{d_{max}} \frac{N_d \, \tilde{\omega}_d}{L} \, Q\left(\sqrt{d \, \frac{2RE_b}{N_0}}\right) \qquad (2.30)$$

Here, we have

$d_{free}$: the minimum Hamming weight of all possible codewords, the free distance
$d_{max}$: the maximum of the Hamming weights of all codewords, which is equal to $L/R$
$\tilde{\omega}_d$: the average weight of the information bits when the Hamming weight of the
codeword is $d$
$N_d$: the multiplicity of code words with Hamming weight $d$

An effective multiplicity of code words with weight $d$ can be defined as $N_d/L$.

This procedure of deriving the union upper bound on the performance of
convolutional codes can also be used to derive the bound for turbo codes. We know that the
lowest weight output, which decides the free distance of the code, can be regarded as the
dominant factor for the code performance, so a further simplification can be applied. We
get a performance bound for turbo codes based on the free distance of the code and the
multiplicity of all the free distance code words.

$$P_b \approx \frac{N_{free} \, \tilde{\omega}_{free}}{L} \, Q\left(\sqrt{d_{free} \, \frac{2RE_b}{N_0}}\right) \qquad (2.31)$$

We have mentioned that some researchers assume that the free distance code
words are formed by the weight 2 input sequences. In this case the bound is simplified as

$$P_b \approx \frac{N_2 \cdot 2}{L} \, Q\left(\sqrt{d_{free} \, \frac{2RE_b}{N_0}}\right) \qquad (2.32)$$

in which $N_2$ represents the multiplicity of free distance code words caused by weight 2
information sequences. But from our analysis above, we prefer to say that the free distance
codewords are generated from the lowest weight input sequences, whose weight is not
necessarily 2.

From our knowledge of the Gaussian density function, equation
(2.32) implies that a smaller Hamming weight $d$ causes a larger value of the Q function and,
in turn, a larger error probability. That is, the smaller the free distance of the code, the
worse the performance.
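The dependence of the free distance approximation on $d_{free}$ can be illustrated numerically. The sketch below assumes the form $P_b \approx (N_{free}\,\tilde{\omega}_{free}/L)\,Q(\sqrt{d_{free} \cdot 2RE_b/N_0})$; the multiplicity, information weight, rate and $E_b/N_0$ passed to it are hypothetical values chosen only to show the trend, not results from our simulations.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def free_distance_bound(n_free, w_free, d_free, block_len, rate, ebno_db):
    """Approximate BER contribution of the free distance codewords only."""
    ebno = 10.0 ** (ebno_db / 10.0)
    argument = math.sqrt(d_free * 2.0 * rate * ebno)
    return n_free * w_free / block_len * q_function(argument)


# Hypothetical parameters: 3 free distance codewords of information weight 2,
# block length 65536, rate 1/2, Eb/N0 = 2 dB.
for d_free in (4, 6, 8):
    print(d_free, free_distance_bound(3, 2, d_free, 65536, 0.5, 2.0))
```

With all other parameters fixed, the printed values decrease as $d_{free}$ grows, in line with the statement that a smaller free distance gives worse performance.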


2.4 Relation between the system parameters and output weight distribution 

There are tight relationships between the output weight distribution and the
generator polynomials, the interleaver and the puncturing patterns. Based on these
relations, we can find criteria for selecting the best parameters, which help to decrease the
variance of the output weight distribution spectrum and thus improve the performance of
the turbo codes. In our work, extensive simulations have been done to search for the best
parameters and to support our selection criteria. In these simulations, we
implemented a two-encoder parallel concatenation system with memory size 4. The MAP
decoding algorithm is applied with 18 iterations. In each group of simulations, the total
number of information bits simulated is $10^{7}$.

2.4.1 Generator polynomial 

Our first consideration is the component code generated by the convolutional
encoder. Based on our previous analysis, the low weight code words are mainly formed by
low weight information sequences. So our main concern is the low weight
information sequence.

The use of an RSCC (recursive systematic convolutional encoder) is
important. An NSCC (non-recursive systematic convolutional encoder) maps a finite weight
input sequence into a finite weight output sequence. The output weight of the NSCC is
correlated with its input weight and cannot satisfy the requirement of random-like codes.
The improvement of the RSCC is obtained because a finite weight input sequence can be
mapped into an infinite weight output sequence. The output weight of the RSCC has the
same distribution as that of a random code sequence. The RSCC gives the greatest gain
when used in parallel concatenated codes.

If NSCC is used, the output weight of the low weight information sequence will 
always be low. RSCC provides significant improvement in the output weight of parity 
sequence. The generation of most of the low- weight codewords is avoided by the use of 
RSCC because of the contribution of the feedback structure of the encoder. This structure 
makes the previously encoded information bits feed back continuously to the encoder’s 
input. However, for small number of low weight information sequences with certain 
nonzero bits distribution, low weight output will still be possibly formed even by RSCC. 
An example to illustrate this is given below. 

We use a weight-2 information sequence for the example. Assume that we have a 
RSCC with memory size M = 2 as shown in Fig 2. 1 1 The feed-forward and feedback 
polynomials of the RSCC are 1+ D 2 and 1+D +D 2 respectively. The weight 2 information 
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sequence is assumed to be [1 0 0 1 0 0 0 0 0 0 ... ] and can be described as 7+ D 3 . The 
low weight output parity sequence [111 1 00000. ...J is formed by our encoder. When 
the first nonzero bit of the information sequence comes, the trellis path of the 
convolutional codes diverges from the all zero state. Later, when the second nonzero bit 
inputs to the encoder, it happens to drive the encoder back to the all zero state. After that, 
all the remaining bits in the information sequence are zeros. None of them can lead the 
encoder away from the all zero state again. So, other than the first four 1 ’s, all other 
parity bits are zeros. The weight of the parity bits thus is only 4. As a comparison, we 
assume another weight 2 sequence, such as [1 000 1 0000 0 ....], or described as 7+ 
D 4 . This time the second nonzero bit in the information sequence does not drive the 
encoder back to the all zero state, thus the subsequent zero input and the feedback of the 
encoder force the encoder to go through a loop of several different states. The parity bits 
formed by this sequence will be [1 1 1 0 1 0 1 1 0 1 1 0 1...], a high weight output 
sequence. 



Pi 

Fig 2.1 1 An example of turbo encoder with M-2 and feedback polynomial 1+D +D 2 
These two examples show that weight-2 information sequences can generate either low
weight or high weight output sequences. The difference between the two input sequences is
the distribution of the nonzero information bits. For the encoder used in the example and
any weight-2 input sequence, there are several distributions of the nonzero bits that can
cause low weight output. The first case is the sequence which can be described as
1 + D^(3z), where z is a small positive integer. All these 1 + D^(3z) input sequences are
divisible by the feedback polynomial 1 + D + D^2, so the second nonzero bit of the
information sequence drives the encoder back to the all zero state, as was the case for the
1 + D^3 sequence. Since the weight of the parity bits grows a little each time z increases
by 1, z can only be a very small integer. Otherwise, even though the second nonzero bit
finally drives the encoder back to the all zero state, the weight of the parity bits has
already become large before that. The second case is a delayed version of a 1 + D^(3z)
sequence, which also gives low weight output. This group of sequences can be described as
D^t (1 + D^(3z')), where t is the delay and z' is also a small positive integer. For example,
a delayed version of the input sequence in the first example, [1 0 0 1 0 0 0 0 0 ...], is
[0 0 0 1 0 0 1 0 0 0 0 ...], which also causes low weight parity bits. The third case arises
when the first nonzero information bit appears at the very end of the input sequence. In
this case, although the second nonzero bit is not 3z bits away from the first nonzero bit,
low weight output will be generated. For all weight 2 information sequences other than
these three cases, the output sequence will have infinite weight if no termination is
executed at the end to make the parity sequence have the same length as the information
sequence. Even with termination, the weight of the output sequence will still be high.
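
The divisibility condition behind the first case can be checked numerically. The following
is a minimal sketch, not part of the original simulations: it represents GF(2) polynomials
as Python integers (bit i holds the coefficient of D^i, an assumed convention) and tests
which weight 2 inputs 1 + D^e are divisible by the feedback polynomial 1 + D + D^2. The
helper names gf2_mod and returns_to_zero are hypothetical.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of GF(2) polynomial division; bit i is the coefficient of D^i."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # Cancel the leading term of the dividend with a shifted copy of the divisor.
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


FEEDBACK = 0b111  # 1 + D + D^2, the feedback polynomial of the example encoder


def returns_to_zero(e: int) -> bool:
    """True if the weight 2 input 1 + D^e is divisible by the feedback polynomial."""
    return gf2_mod((1 << e) | 1, FEEDBACK) == 0


divisible = [e for e in range(1, 13) if returns_to_zero(e)]
print(divisible)  # [3, 6, 9, 12]
```

Only the exponents that are multiples of 3 pass the test, which agrees with the
1 + D^(3z) form described above.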

The input sequence can have an even lower weight than 2, namely weight 1. Though a
weight 1 information sequence will definitely cause very low output weight for an NSCC,
this is not the case for an RSCC. The code word generated by a weight 1 information
sequence will be of infinite length without termination. This is because, after
the only nonzero information bit causes the trellis path to diverge from the all zero state,
there is never another nonzero bit in the information sequence to remerge the path
with the all zero state. Thus for weight 1 input, the only possibility to form low weight
output is that the 1 appears at the very end of the sequence. Now let us see what happens
to a low weight information sequence that has a weight larger than 2 but still small
(low weight is the main concern for the performance). Similar to the weight 2 case,
some of the nonzero bit distributions cause low weight output while others cause
infinite output weight when no termination is done. If a low weight information sequence
is made up of several weight 2 sequences that cause low weight output, the information
sequence will cause low weight output too. For the encoder in Fig 2.11, we found that
sequences which can be described as a sum over i of D^(t_i) (1 + D^(3 z_i)) cause low weight
output when each z_i is small and there are not too many components to be summed up. For
example, a low weight parity sequence is generated from the weight 4 information sequence
[1 0 0 1 0 ... 0 1 0 0 1 0 0 ...].

Encoders with primitive feedback polynomials are the best choices because they
help to achieve a large free distance of the codes. Assume we have the generator matrix of
the RSCC encoder as follows


G(D) = \left[\, 1 \quad \frac{g_1(D)}{g_0(D)} \,\right]        (2.32)


where g_1(D) and g_0(D) are referred to as the feed-forward and feedback polynomials
respectively. Again, weight 2 information sequences will be considered here to simplify
our analysis. What we want to maximize is the weight of the parity bits, that is

p(D) = u(D) \frac{g_1(D)}{g_0(D)}        (2.33)

Here u(D) denotes the information sequence and p(D) the parity sequence. For the
weight 2 input case, we assume u(D) = 1 + D^e, where e is the smallest finite value
for which this information sequence generates a low weight output (a free distance
code word). Then p(D) can be written as

p(D) = (1 + D^e) \frac{g_1(D)}{g_0(D)} = \frac{g_1(D)}{g_0(D)} + D^e \frac{g_1(D)}{g_0(D)}        (2.34)

Since the nonzero part of p(D) is also of finite length e, g_1(D)/g_0(D) must be periodic
with period e. On average, half of the bits in the length-e nonzero subsequence of p(D) will
be 1 and count toward the output weight. Approximately, then, a larger value of e
means a higher weight for the parity bits. For a strictly proper rational
function of two polynomials such as g_1(D)/g_0(D), this period e is no larger than 2^M - 1,
where M is the number of memory elements used in the encoder. The period e reaches this
maximum when the feedback polynomial g_0(D) is primitive. On average, a primitive
polynomial results in a larger free distance in turbo codes.

In our simulations, we set the memory size of the recursive encoder to be 4. Thus
we can expect the maximum period e to reach 15 when the feedback polynomial is
primitive. For the case M = 4, there exist two primitive polynomials, 1 + D + D^4 and
1 + D^3 + D^4. Written in octal, they are 23 and 31. For the feed-forward
polynomial, a criterion to select the best one has not been found yet. However, since there
are only 2 possibilities for the feedback polynomial and only 8 choices (21, 23, 25, 27, 31,
33, 35, 37) for the feed-forward polynomial, there are only 16 possible combinations. This
is a small enough number that the best combination can be found from simulations. Our
simulations show that, among all the combinations, the (23, 31) generator polynomial gave
the best performance. We take this combination to be the optimal choice of the generator
polynomial when the memory size is 4. However, our simulations also showed that the
performance of the turbo codes does not differ significantly between generator
polynomials. Fig 2.12 shows the encoder with the (23, 31) generator polynomial. Fig 2.13
gives the results for one group of comparisons between the (23, 31) and (31, 27) codes.
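
The period claimed for the two primitive feedback polynomials can be verified with a few
lines of code. This is an illustrative sketch only: it assumes the octal notation is read
with the highest power of D as the most significant bit (which matches the pairing of
1 + D + D^4 with 23 in the text), and it computes the period of 1/g_0(D) as the smallest e
for which D^e = 1 modulo g_0(D). The function names are hypothetical.

```python
def octal_to_poly(octal: str) -> int:
    """Octal generator to bit mask; the MSB is taken as the highest power of D."""
    return int(octal, 8)


def period(poly: int) -> int:
    """Smallest e >= 1 with D^e = 1 modulo poly (poly needs a nonzero constant term)."""
    degree = poly.bit_length() - 1
    state, e = 1, 0
    while True:
        state <<= 1              # multiply by D
        if state >> degree:      # reduce modulo poly when the degree is reached
            state ^= poly
        e += 1
        if state == 1:
            return e


print(period(octal_to_poly("23")), period(octal_to_poly("31")))  # 15 15
print(period(octal_to_poly("37")))  # 5, a non-primitive polynomial for comparison
```

Both primitive polynomials reach the maximum period 2^M - 1 = 15, while the non-primitive
polynomial 37 gives a much shorter period.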



Fig 2.12 Recursive encoder with generator polynomial (23, 31)



Fig 2.13 Comparison of the (23, 31) and (31, 27) generator polynomials 


2.4.2 Interleaver 

The interleaver is regarded as the most important component in the turbo encoding
system. We examine the influence of the interleaver size and the interleaver algorithm on
the output weight distribution.

In a parallel concatenation of two encoders, the interleaver is used
to change the distribution of the information sequence before it is input to
the second encoder. So the input sequences to the two encoders in the system are actually
different. If the original input sequence has a very low weight, and its nonzero bit
distribution happens to cause low weight output parity bits in the first encoder, it is very
unlikely that the input sequence to the second encoder, after the interleaving, will still
have a nonzero bit distribution which causes low weight output.

We want the interleaver to make the probability of low weight output
simultaneously from both encoders very small. This probability depends on the
algorithm used for the interleaver. A random interleaver is preferred over a nonrandom
interleaver since a random interleaver is much better at breaking the correlation between
the bits of the information sequence. It can give the output weight distribution a shape
similar to that of random codes, which brings the performance very close to the Shannon
limit. We also want the interleaver size (equivalent to the length of the information
sequence) to be as large as possible. A larger interleaver size is required by the
performance bound equations to provide a low probability of coding error. It is also easy
to determine that the probability of simultaneous low weight output from both encoders is
inversely proportional to the interleaver size, so the probability can be decreased
significantly by increasing the interleaver size.

The interleaving algorithm and the interleaver size are the two considerations in the
selection of the interleaver. We have mentioned that, for the first concern, a random
interleaver is preferred over a non-random interleaver. A pseudo-random interleaver is
selected in our simulation work since it is the most commonly used random interleaver. Our
main concern is how significantly the interleaver size influences the
performance, and what interleaver size is appropriate for practical
applications. For the appropriate size, two factors should be considered. First, we know
the bit error rate (BER) decreases as the size of the interleaver increases. This
effect is called interleaver gain and demonstrates the need for a larger interleaver. On
the other hand, increasing the interleaver size noticeably slows down
the turbo codes, especially turbo decoding. This is because a certain
number of iterations are needed in the decoding process to improve the performance, and
in each iteration the interleaving and deinterleaving (the inverse of interleaving)
processes must be executed several times. Thus a tradeoff is needed between better
performance of the turbo codes and real time decoding.

In order to observe the code performance with different interleaver sizes, we set
the generator polynomial to be (23, 31) and select the 4/5 code rate for the simulations.
The puncturing pattern is selected to be P(7, 6), which is claimed by [3] to be an optimum
choice. The interleaver sizes 256x256, 128x128, 64x64, 32x32 and 16x16 are compared. To
keep the number of bits tested the same in each case (about 10^7), the numbers of blocks
selected for the simulations are 150, 600, 2400, 9600 and 38400 respectively. The
simulation results are shown in Fig 2.14 and Table 2.3.
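
As a quick check of the block counts, and to make the interleaving and deinterleaving
operations concrete, a minimal sketch follows. It assumes that a size written as 256x256
means an interleaver holding 256 * 256 bits, and it uses Python's standard pseudo-random
generator with an arbitrary seed; neither detail is taken from the original simulation
program, and the function names are hypothetical.

```python
import random


def make_interleaver(size: int, seed: int = 0) -> list[int]:
    """Pseudo-random permutation of range(size); the seed is an arbitrary choice."""
    perm = list(range(size))
    random.Random(seed).shuffle(perm)
    return perm


def interleave(bits, perm):
    """Output position i takes the input bit at position perm[i]."""
    return [bits[p] for p in perm]


def deinterleave(bits, perm):
    """Inverse of interleave: put each bit back at its original position."""
    out = [0] * len(perm)
    for i, p in enumerate(perm):
        out[p] = bits[i]
    return out


sides = [256, 128, 64, 32, 16]
blocks = [150, 600, 2400, 9600, 38400]
total_bits = [s * s * b for s, b in zip(sides, blocks)]
print(total_bits)  # every entry is 9830400, i.e. about 10^7
```

Under this reading of the sizes, each of the five simulations tests exactly the same
number of information bits.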



Fig 2.14 The influence of interleaver size on the performance of turbo codes 


Table 2.3 Performance of 4/5 turbo codes with different size interleavers

Interleaver size            256x256   128x128   64x64   32x32   16x16
Eb/N0 at BER 10^-5 (dB)     2.5       2.6       2.8     3.4     >6.0
Distance to bound (dB)      0.5       0.6       0.8     1.4     >4.0
Coding gain (dB)            7.1       7.0       6.8     6.2     <3.6

There are several observations from the simulation results. First, as the interleaver size
decreases, the performance of the 4/5 turbo codes degrades quickly. Table 2.3
shows the Eb/N0 values needed with different interleaver sizes to reach a BER of 10^-5. The
performance is also compared with the Shannon limit: the coding gain and the distance from
the bound at rate 4/5 are given in the table. Second, the run time of the program increases
significantly when the interleaver size is increased. The third observation is the so called
floor flaring effect (error floor): as the Eb/N0 value increases steadily,
the rate of improvement in performance decreases significantly. This is a serious effect
because a large increase of Eb/N0 achieves only a very small improvement in the
performance of the turbo codes. The error floor effect is found to be caused by the
performance union bound of the turbo codes and happens when the performance is near
the bound. When it happens, the slope of the curve drops and then stays the same as the
slope of the bound. In our simulation results, no floor flaring effect is found for sizes
256x256 and 128x128. Thus the floor flaring effect happens below the 10^-6 level for
these two cases. It cannot be observed because it is beyond the capability of our
simulation. But for sizes less than 64x64, the floor flaring effect is obvious as a low slope
region of the performance curve. The BER values where the error floor effect appears are
listed in Table 2.4. If 10^-6 is set as the level to decide whether the floor flaring effect
is significant enough to influence the performance of turbo codes, then for the
pseudo-random interleaver we used, 64x64 is the minimum acceptable size.

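Table 2.4 Floor flaring effect for different interleaver sizes

Interleaver size     BER level of the error floor effect
256x256              << 10^-6
128x128              << 10^-6
64x64                Between 10^-? and 10^-? (exponents illegible)
32x32                Between 10^-? and 10^-? (exponents illegible)
16x16                Between 10^-4 and 10^-? (exponent illegible)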

2.4.3 Puncturing pattern

Puncturing is also an important factor in determining the performance of the turbo
codes. The higher the desired code rate, the more parity bits need to be punctured, and the
poorer the performance of the turbo codes. From another view, puncturing
decreases the output weight, which decreases the free distance of the code and degrades
its performance.

The puncturing pattern decides which parity bits are punctured and
which are kept. The notation P(c_1, c_2) is used to indicate the
puncturing pattern of turbo codes. In [3], the authors claimed that, from their simulation
results, some of the puncturing patterns, such as P(3, 4) for rate 2/3 and P(7, 6) for rate
4/5, are the optimal choice. No theoretical proof was given in the paper. Our opinion is
that the selection of the puncturing pattern is related to the interleaver
algorithm: different interleavers will have different requirements for the puncturing
pattern. For the pseudo-random interleaver, which is used in [3] and also in our
simulations, we do not think the value of c_1 or c_2 has any significant influence on the
code performance. After the random interleaving, the order of the information bits has been
completely changed. There is no reason to say that keeping the c_1-th bit in the first parity
sequence and the c_2-th bit in the second sequence is better than other choices. To test
this view, we ran a set of simulations. Table 2.5 shows the different puncturing
patterns used in the simulations. The second column of the table gives the patterns that
were claimed to be optimal by [3] at four different rates. Two other patterns were
randomly selected for comparison at each rate; they are listed in the third and fourth
columns of the table. The fifth column lists the Eb/N0 value selected for each
rate in the simulations.


Table 2.5 Puncturing patterns selected for different code rates

Code rate   Puncturing pattern   Random puncturing   Random puncturing   Eb/N0 (dB)
            from [3]             pattern (1)         pattern (2)
2/3         P(3,4)               P(1,2)              P(1,1)              1.6
3/4         P(3,4)               P(3,3)              P(3,2)              2.1
4/5         P(7,6)               P(1,6)              P(1,1)              2.5
16/17       P(2,2)               P(2,11)             P(11,11)            (illegible)


Fig 2.15 Comparisons of different puncturing patterns for high rates at certain Eb/N0
(panel title: [23, 31] code with r = 3/4 at Eb/N0 = 2.1)

Fig 2.15 shows the simulation results for all four rates at these Eb/N0 values.
We compared the performance of the three selected puncturing patterns at
different iterations. As expected, there is no obvious difference in the performance of the
three puncturing patterns at any of the rates. The puncturing pattern selected by
[3] cannot be distinguished as the optimal choice.

In our simulations, we happened to find that for some special rates, turbo codes
show very poor performance, and increasing the interleaver size does not bring as large an
improvement as for other rates. Fig 2.16 shows the case for one of the special rates, rate
5/6. In Fig 2.16, we see that the BER reaches 10^-5 at Eb/N0 = 8.3 dB. This performance is
much worse than at other rates and far from what is expected. The other special rates
giving poor performance within our range of interest (rate 1/2 to 16/17) are 10/11 and
15/16. We find that for these special rates, the puncturing pattern does have a significant
effect on the performance. Next, we explain the poor performance of these rates,
and then present the modified puncturing patterns we designed, which successfully
improved the performance of these codes to a level as good as all other rates.



Fig 2.16 The performance of 5/6 code with different interleaver sizes 

As mentioned above, the impulse response to a single data input shows a
periodic pattern of parity bits at the output of the recursive encoder because of the
feedback structure. Since the encoder is linear, an input of two or more data bits yields a
sum of shifted versions of this periodic pattern and is itself essentially periodic. Thus a
periodic structure exists in the output parity sequence of the encoder. When a primitive
feedback polynomial is used, this period reaches its maximum. For our M = 4 case, that
period is 15. That is to say, within this period there are 15 different locations from which
the kept bits can be selected. Comparing the influence of this period of 15 on the normal
and special rates, we found that for normal rates, the bits at all 15 locations have the same
probability of being kept by puncturing. But for the special rates, only the bits at some of
the 15 locations can be kept, and the bits at all other locations are never
selected. To make this clearer, we compare rate
2/3 as a normal rate with rate 5/6 as a special rate. For rate 2/3 turbo codes, we keep 1
bit in every 4 parity bits to achieve the desired code rate. Similarly, for rate 5/6 turbo
codes, we keep 1 bit in every 10 parity bits. Without any essential loss of generality, for
the 2/3 code rate we choose a puncturing pattern which keeps the third bit in each 4 parity
bits, and for the 5/6 code rate we keep the first bit in each 10 parity bits. Table 2.6 and
Table 2.7 below show how many different locations in the period of 15 can be selected for
these two rates.


Table 2.6 Selected bit locations after puncturing for 2/3 rate

3rd in each 4 parity bits    3    7    11   15   19   23   27   31
Location in period 15        3    7    11   15   4    8    12   1

3rd in each 4 parity bits    35   39   43   47   51   55   59
Location in period 15        5    9    13   2    6    10   14



Table 2.7 Selected bit locations after puncturing for 5/6 rate

1st in each 10 parity bits   1    11   21   31   41   51   61   71
Location in period 15        1    11   6    1    11   6    1    11

1st in each 10 parity bits   81   91   101  111  121  131  141  ...
Location in period 15        6    1    11   6    1    11   6



From Table 2.6, we see that in the case of rate 2/3, the 3rd location is picked in the first
group of 4 parity bits and the 7th location in the second group. Then the 11th and
15th locations are picked in the 3rd and 4th groups of 4 parity bits. Continuing in this way,
as shown in Table 2.6, all 15 locations are selected with the same probability. Then we
look at the special rate 5/6 case in Table 2.7. In the first and second groups of 10 parity
bits, the 1st and the 11th locations are selected. Then in the 3rd group, the 6th location is
picked. In the following groups, we see from the table that the 1st, 11th and 6th locations
are selected over and over again. Thus, only a limited number of locations are picked in
this case (this is also true for the other special code rates, 10/11 and 15/16). When the
information sequence has low weight, there is a large probability that, over a long run of
consecutive periods of 15, the bits at some locations always have the same value.
Suppose these bits are all zeros; then after the puncturing, the weight of the output
sequence will be very low because the weight of the retained parity bits is too low. Under
such circumstances, even if the encoder itself does not generate a low weight output
sequence, the final output weight is very low because of the inappropriate puncturing
pattern. This is the reason for the poor performance.
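
The location counts in Table 2.6 and Table 2.7 follow from simple modular arithmetic. The
sketch below is only an illustration of that arithmetic, with a hypothetical function name:
it steps through the kept bit positions and records which of the 15 locations in the
encoder period each one falls on.

```python
def kept_locations(first_kept: int, puncture_period: int, encoder_period: int = 15) -> list[int]:
    """Sorted 1-based locations within the encoder period hit by the kept parity bits."""
    seen = set()
    position = first_kept
    # The residues repeat after at most encoder_period groups, so this many steps suffice.
    for _ in range(encoder_period):
        seen.add((position - 1) % encoder_period + 1)
        position += puncture_period
    return sorted(seen)


print(kept_locations(3, 4))   # rate 2/3, 3rd bit of every 4: all locations 1..15
print(kept_locations(1, 10))  # rate 5/6, 1st bit of every 10: [1, 6, 11]
```

For the rate 2/3 pattern every location is visited, whereas the rate 5/6 pattern only ever
visits the three locations listed in Table 2.7.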

The solution to this problem is to modify the puncturing pattern so that more
distinct locations can be selected for these rates. Here we show how we designed the
alternative pattern for the 5/6 code rate to improve the performance. We have calculated
that if the first bit in each 10 parity bits is retained, only the 1st, 11th and 6th of the 15
locations can be selected. Similarly, if the second bit in each 10 parity bits is retained, it
picks the 2nd, 12th and 7th locations. In Table 2.8, P(i) in the first column means that the
i-th bit is retained in each 10 parity bits. The second column shows the locations that can
be picked in the period of 15. From this table, it is not difficult to observe that if we
select the 1st, 12th, 23rd, 34th and 45th bits in every 50 parity bits, all the locations are
selected and the code rate is still 5/6. Simulations were performed to examine this
alternative puncturing method. Fig 2.17 shows that for all interleaver sizes, the
performance of the codes improved significantly. For larger interleaver sizes, the
improvement is especially significant. The Eb/N0 value at which the BER reaches 10^-5 is
decreased from 8.3 dB to 2.8 dB by our modified puncturing pattern.
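
The coverage of the modified pattern can be confirmed in the same way. The following sketch
(hypothetical function name, not the original simulation code) compares the original rate
5/6 pattern with the modified one that keeps the 1st, 12th, 23rd, 34th and 45th bits in
every 50 parity bits.

```python
def covered_locations(kept_bits, group_size, encoder_period=15):
    """1-based locations within the encoder period hit by a repeating puncturing group."""
    seen = set()
    for group in range(encoder_period):
        for bit in kept_bits:
            position = bit + group * group_size
            seen.add((position - 1) % encoder_period + 1)
    return sorted(seen)


original = covered_locations([1], 10)
modified = covered_locations([1, 12, 23, 34, 45], 50)

print(original)       # [1, 6, 11]
print(len(modified))  # 15

# Both patterns keep one parity bit in ten, so the code rate is unchanged.
kept_fraction_original = 1 / 10
kept_fraction_modified = 5 / 50
```

The modified pattern visits all 15 locations while keeping the same fraction of parity bits
as the original pattern.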





Fig 2.17 The improvement of the performance with the modified
puncturing pattern at different interleaver sizes
(horizontal axis: Eb/N0 (dB); legend: normal, special)

Table 2.8 The locations selected by selecting different bits
in each 10 parity bits for 5/6 rate

Retained bit    Locations picked in the period of 15
P(1)            1, 11, 6
P(2)            2, 12, 7
P(3)            3, 13, 8
P(4)            4, 14, 9
P(5)            5, 15, 10


For rates 10/11 and 15/16, the same method is used to design the new puncturing
patterns. In Fig 2.18, we can see that the performance of the three special code rates (5/6,
10/11, 15/16) improved very significantly.



Fig 2.18 The improvement of the performance with the modified
puncturing pattern at code rates 5/6, 10/11, 15/16
(horizontal axis: Eb/N0 (dB))

CHAPTER 3

ITERATIVE BLOCK DECODING


Many more efficient algorithms have been found for using channel measurement
information (soft decisions) in the decoding of convolutional codes than in the decoding of
block codes, so researchers are concerned with the maximum likelihood decoding of linear
block codes using channel measurement information. This decoding method is
particularly useful for decoding high-rate codes because the complexity increases
very fast with the number of parity bits. To implement maximum likelihood decoding
on linear block codes, it is necessary to construct a trellis for the block code. So in
section 3.1, the method to construct a trellis from a linear block code is introduced. The
iterative log-likelihood decoding algorithm is given in section 3.2. The implementation
of this method using the trellis is discussed in section 3.3.

3.1 Construction of trellis from block codes 

3.1.1 Characteristics of the trellis constructed from block codes 

Soft decision, maximum likelihood decoding of any (n, k) linear block code can
be accomplished by using the Viterbi algorithm. If the block code is over GF(2), the
trellis constructed will have these characteristics:

1) The depth of the trellis is n.

2) There are no more than 2^(n-k) states in the trellis.

3) There are 2^k paths through the trellis; each of the 2^k distinct codewords corresponds
to a distinct path.

4) Each node in the trellis represents an (n-k)-tuple with elements 0 or 1 (the two
elements of GF(2)).

5) Each transition between two states is labeled with the appropriate codeword symbol v_i;
the first k symbols represent the k information bits u_i, and the following n-k symbols
represent the parity bits.

There are also some other properties for special block codes:

1) For a cyclic code, the trellis is periodic.

2) For a product code, the number of states in the trellis can be much less than
2^(n-k) [5].

3) For the single parity check code, the Viterbi algorithm applied to the trellis is the same
as Wagner decoding.


3.1.2 The method of construction 

The general formulation of the trellis for linear block codes uses the systematic H
matrix of the code. Compared to the trellis of convolutional codes, the structure of the
trellis formed from block codes is irregular. Here s_j(i) represents a node at
depth i, where the subscript j denotes the j-th of the 2^(n-k) possible states, r_i is
the input bit between depth i and depth i+1, and h_i is the i-th column of the H
matrix. The steps for constructing a trellis are then as follows:

1) The trellis starts at depth i=0 with the all zero state, named s_0(0).

2) At each depth i, the collection of nodes at depth (i+1) is obtained from the collection
of nodes at depth i using the formula (addition is modulo 2)

s_l(i+1) = s_j(i) + r_i h_{i+1}        (3.1)

3) Nodes and lines that do not end at the all zero state at depth n are removed.

Here we give an example of a Hamming code to show how to follow these three
steps to construct the trellis.

Hamming codes are (2^m - 1, 2^m - 1 - m) block codes, given by
(1.12) and (1.13). For convenience, we choose m = 3 and thus get the (7,4) Hamming code.
The minimum distance of the code is 3, which means that this code can
detect two errors but correct only one error. Assume we have a
systematic H matrix as below


H = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{bmatrix}        (3.2)


We can construct the trellis with the H matrix. The trellis runs from depth 0
to depth 7, and has at most 8 states at each depth, from 000 to 111.

The initial state at depth 0 is 000 as described in step 1. The input between depth
0 and depth 1 has two possible values, 0 and 1. We can calculate the states at depth 1 by
using equation (3.1):

when input is 0, s_0(1) = s_0(0) + 0 x h_1 = 000 + 0 x 101 = 000        (3.3)

when input is 1, s_5(1) = s_0(0) + 1 x h_1 = 000 + 1 x 101 = 101        (3.4)

Here h_1 is the transpose of the first column of the H matrix.

Thus at depth 1, we have two states, 000 and 101. Then by the same method,
four states are obtained at depth 2. Two of them are obtained from the state 000 at depth
1, the other two from the state 101 at depth 1, by the different inputs 0 and 1 (Fig 3.1).


The calculations are shown here:

From state 000 at depth 1,

when input is 0, s_0(2) = s_0(1) + 0 x h_2 = 000 + 0 x 111 = 000        (3.5)

when input is 1, s_7(2) = s_0(1) + 1 x h_2 = 000 + 1 x 111 = 111        (3.6)

From state 101 at depth 1,

when input is 0, s_5(2) = s_5(1) + 0 x h_2 = 101 + 0 x 111 = 101        (3.7)

when input is 1, s_2(2) = s_5(1) + 1 x h_2 = 101 + 1 x 111 = 010        (3.8)


The same method is applied repeatedly for depths 3 to 7, and the
number of states remains no more than 8 at these depths. The completely constructed
trellis is shown in Fig 3.1.


Fig 3.1 The trellis constructed for a (7,4) Hamming code before expurgation
(states 000 to 111 on the vertical axis)

Following step 3, the next step is to remove the nodes and lines that do not end at the
000 state at depth n. The trellis after expurgation is shown in Fig 3.2.
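
The three construction steps can be sketched in code for the H matrix of (3.2). This is an
illustrative sketch under our own assumptions, not the program used in this work: states
are stored as integers whose binary form is the (n-k)-tuple, with the top row of H as the
most significant bit, and the function names are hypothetical.

```python
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]
n = len(H[0])
# Column i of H packed into an integer, top row as the most significant bit.
columns = [int("".join(str(row[i]) for row in H), 2) for i in range(n)]


def build_trellis(columns):
    """Steps 1 and 2: levels[i] maps each state at depth i to (input_bit, next_state) pairs."""
    levels, states = [], {0}
    for h in columns:
        edges = {s: [(0, s), (1, s ^ h)] for s in states}
        levels.append(edges)
        states = {nxt for out in edges.values() for _, nxt in out}
    return levels


def expurgate(levels):
    """Step 3: drop branches and nodes that cannot reach the all zero state at depth n."""
    alive, pruned = {0}, []
    for edges in reversed(levels):
        kept = {s: [(b, t) for b, t in out if t in alive] for s, out in edges.items()}
        kept = {s: out for s, out in kept.items() if out}
        pruned.append(kept)
        alive = set(kept)
    return pruned[::-1]


def count_paths(levels):
    """Number of distinct paths from the start state through all levels."""
    counts = {0: 1}
    for edges in levels:
        nxt = {}
        for s, out in edges.items():
            for _, t in out:
                nxt[t] = nxt.get(t, 0) + counts.get(s, 0)
        counts = nxt
    return sum(counts.values())


full = build_trellis(columns)
print(sorted(full[1]))              # states at depth 1: [0, 5], i.e. 000 and 101
print(sorted(full[2]))              # states at depth 2: [0, 2, 5, 7]
print(count_paths(expurgate(full))) # 16 surviving paths, one per codeword
```

The states found at depths 1 and 2 agree with (3.3) to (3.8), and the expurgated trellis
keeps 2^k = 16 paths, as required by the trellis characteristics in section 3.1.1.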


Fig 3.2 Expurgated trellis for (7,4) Hamming codes
(states 000 to 111 on the vertical axis; Depth 0 to Depth 7 on the horizontal axis)

For cyclic codes, an alternative method can also be used to form the trellis. It is
built by tracing all the possible states of the storage devices for all possible inputs. The
number of trellis states at depth i in the expurgated trellis is 2^i for i in the range
[1, n-k-1], 2^(n-k) for i in the range [n-k, k], and 2^(n-i) for i in the range [k, n]. The
trellis repeats its pattern in the range [n-k, k].

The steps for building the trellis are as follows:

1) The trellis starts at depth i=0 with the all zero state.

2) The polynomials at depth i+1 are then formed from the polynomials at depth i in
accordance with the formula: s_l(x; i+1) = (x s_j(x; i) + x^(n-k) r_i) modulo g(x)

3) Nodes and lines that do not end at the all zero state at depth n are removed.

In polynomial notation, each of the states is represented by a polynomial.

3.2 Iterative log-likelihood decoding of binary block codes

In this part, to show how the decoding algorithm works, we will introduce the
log-likelihood algebra, the soft-in / soft-out decoder, the iteration algorithm and some
optimal and sub-optimal algorithms being used.


3.2.1 Log-likelihood algebra 

The log-likelihood ratio of a binary random variable u is defined as

L(u) = \log \frac{P(u = u_1)}{P(u = u_2)}        (3.9)

P(u) here denotes the probability of the random variable u. This ratio is referred to as the
soft value. The sign of L(u) is the hard decision, and the magnitude is the reliability of
this decision. If the random variable u is conditioned on another random vector y,
then the conditioned log-likelihood ratio can be described as:

L(u | y) = \log \frac{P(u_1 | y)}{P(u_2 | y)}        (3.10)

Note that the probability P(y) appears in both the numerator and the denominator and
cancels out of the ratio, so the joint log-likelihood L(u, y) is equal to the conditioned
log-likelihood L(u | y). From equation (3.10), we then have

L(u | y) = L(u) + L(y | u) = \log \frac{P(u_1)}{P(u_2)} + \log \frac{P(y | u_1)}{P(y | u_2)}        (3.11)

The “symbol by symbol” MAP (maximum a posteriori probability) algorithm is the optimal
decoding algorithm [4]. It can be represented by a trellis of finite duration. The output of
a “symbol by symbol” MAP decoder is defined as the a posteriori log-likelihood ratio for a
transmitted +1 and -1 in the information sequence:

L(\hat{u}) = L(u | y) = \log \frac{P(u = +1 | y)}{P(u = -1 | y)}        (3.12)

Assuming the transmission is over an AWGN channel, we have

p(y | u = +1) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y-1)^2}{2\sigma^2} \right)        (3.13)

p(y | u = -1) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y+1)^2}{2\sigma^2} \right)        (3.14)

Together with equation (3.11), the a posteriori log-likelihood ratio of u conditioned on the
matched filter output y is:

L(\hat{u}) = L(u | y) = \log \frac{P(u = +1)}{P(u = -1)} + \log \frac{p(y | u = +1)}{p(y | u = -1)}

           = \log \frac{P(u = +1)}{P(u = -1)} + \log \frac{ \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y-1)^2}{2\sigma^2} \right) }{ \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y+1)^2}{2\sigma^2} \right) }        (3.15)

           = L(u) + \log\left( \exp\left( \frac{4y}{2\sigma^2} \right) \right) = L(u) + \frac{2}{\sigma^2} y = L(u) + L_c y

L(u) is the a priori ratio. L_c = 2/\sigma^2 is called the reliability of the channel. In our
research, we assume a channel with a constant (time-invariant) L_c.
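
The simplification in (3.15) can be checked numerically. The sketch below is only an
illustration, with hypothetical function names and an arbitrarily chosen noise standard
deviation: it evaluates the log-likelihood ratio directly from the two Gaussian densities
(3.13) and (3.14) and compares it with the closed form L(u) + L_c y.

```python
import math


def gaussian_pdf(y: float, mean: float, sigma: float) -> float:
    """Gaussian density with the given mean and standard deviation."""
    return math.exp(-(y - mean) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def llr_from_densities(y: float, sigma: float, prior_llr: float = 0.0) -> float:
    """L(u|y) computed directly from the two densities, as in the first line of (3.15)."""
    return prior_llr + math.log(gaussian_pdf(y, +1.0, sigma) / gaussian_pdf(y, -1.0, sigma))


def llr_closed_form(y: float, sigma: float, prior_llr: float = 0.0) -> float:
    """L(u|y) = L(u) + L_c * y with L_c = 2 / sigma^2."""
    return prior_llr + (2.0 / sigma ** 2) * y


sigma = 0.8  # arbitrary noise standard deviation, chosen for illustration only
for y in (-1.2, 0.0, 0.4, 2.0):
    print(y, llr_from_densities(y, sigma), llr_closed_form(y, sigma))
```

The two columns agree up to floating point rounding for every received value y.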


3.2.2 Soft-in / soft-out decoder

The log-likelihood algebra shows that any decoder can be used which accepts soft
inputs (including a priori values) and delivers soft outputs (made up of three terms: the
soft channel value, the a priori input, and the extrinsic value). Any linear binary code in
systematic form can be used as the component code, and soft-in / soft-out algorithms
exist for these codes. Fig 3.3 shows a soft-in / soft-out decoder.


Fig 3.3 Soft-in / soft-out decoder
(inputs: input log-likelihoods; outputs: output log-likelihoods)


In Fig 3.3, L(u) represents the a priori values for all the information bits, and L_c y are
the channel values for all code bits. L_e(û) represents the extrinsic values for all
information bits, and L(û) is the soft output, the a posteriori values for all information
bits.

The extrinsic information contains the soft output information from all the other
coded bits in the code sequence. The L(u) and L_c y values of the current bit do not
influence it. Note that the extrinsic values are used as a priori values only for information
bits and not for parity bits, because codeword probabilities are determined from the a
priori probabilities of the information bits only.

For systematic codes, we have three independent estimates for the log-likelihood
ratio. The soft output for the information bit u can be represented by the three additive
terms:

L(\hat{u}) = L_c y + L(u) + L_e(\hat{u})        (3.16)
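
The decomposition (3.16) can be illustrated on the (7,4) Hamming code of (3.2) with a brute
force soft-in / soft-out decoder that sums over all 16 codewords instead of using a trellis.
This is a sketch under our own assumptions, not the decoder used in this work: bit 0 is
mapped to +1 and bit 1 to -1, the received values and L_c are arbitrary, and the function
names are hypothetical.

```python
import itertools
import math

A = [[1, 1, 1, 0], [0, 1, 1, 1], [1, 1, 0, 1]]  # left (non-identity) part of H in (3.2)


def codewords():
    """All 16 systematic codewords (u, p) as +1/-1 symbols (assumed map: 0 -> +1, 1 -> -1)."""
    words = []
    for u in itertools.product([0, 1], repeat=4):
        parity = [sum(a * b for a, b in zip(row, u)) % 2 for row in A]
        words.append([1 - 2 * bit for bit in list(u) + parity])
    return words


def map_decode(y, l_c, prior):
    """Return (soft output, extrinsic value) lists for the 4 information bits."""
    # Input log-likelihoods: channel value for every bit, plus a priori value for info bits.
    l_in = [l_c * y[j] + (prior[j] if j < 4 else 0.0) for j in range(7)]
    soft, extrinsic = [], []
    for i in range(4):
        num = den = 0.0
        for x in codewords():
            metric = math.exp(0.5 * sum(l * s for l, s in zip(l_in, x)))
            if x[i] == 1:
                num += metric
            else:
                den += metric
        l_out = math.log(num / den)
        soft.append(l_out)
        # Rearranged (3.16): extrinsic = soft output - channel value - a priori value.
        extrinsic.append(l_out - l_c * y[i] - prior[i])
    return soft, extrinsic


y = [0.8, -0.2, 1.1, 0.3, 0.9, -0.4, 0.6]  # arbitrary received values
soft, ext = map_decode(y, l_c=2.0, prior=[0.0] * 4)
print(soft)
print(ext)
```

Changing the received value or the a priori value of one information bit leaves the
extrinsic value of that bit unchanged, which is the property stated above.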

3.2.3 Iterative decoding algorithm 

Iterative decoding of systematic convolutional codes has been termed as turbo 
coding. However, it can also be used for linear binary systematic block codes. 


[Figure: soft-in / soft-out decoder with soft output $L(\hat{u})$; the extrinsic values $L_e(\hat{u})$ are fed back for the next iteration]

Fig 3.4 Iterative decoding procedure with soft-in / soft-out decoders

For the first iteration, no a priori value exists, so it is initialized to 0. After that, the extrinsic values are used as the a priori values of the next iteration step, as shown in Fig 3.4.

At first, the L-values are statistically independent, but because the iterations indirectly use the same information over and over again, they become more and more correlated. For the final decision after the last iteration, the last extrinsic values are combined with the received values to form the output.
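The feedback loop described above can be sketched as follows. This is only a minimal outline under stated assumptions: `siso` stands for any soft-in/soft-out routine that maps a priori values to extrinsic values, and both `siso` and `iterative_decode` are hypothetical names introduced for this illustration.

```python
def iterative_decode(channel_llr_info, siso, n_iter):
    """Run n_iter feedback iterations around a soft-in/soft-out routine.

    channel_llr_info: channel values L_c*y of the information bits.
    siso: callable taking the a priori values and returning extrinsic values.
    """
    a_priori = [0.0] * len(channel_llr_info)   # no a priori knowledge at first
    extrinsic = list(a_priori)
    for _ in range(n_iter):
        extrinsic = siso(a_priori)
        a_priori = extrinsic                   # feedback for the next iteration
    # Final decision values: last extrinsic values plus the channel values.
    return [c + e for c, e in zip(channel_llr_info, extrinsic)]


# Toy stand-in for a decoder, used only to show the data flow.
toy_siso = lambda a: [x + 1.0 for x in a]
print(iterative_decode([0.5, -2.0], toy_siso, 3))
```

A real decoder would replace `toy_siso` by one of the algorithms of Section 3.2.4 or 3.3, and the fixed iteration count by a stop criterion.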
The iterations can be controlled by a stop criterion derived from cross entropy, where $t$ denotes the iteration number [4]:


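$$T(t) = \sum_{k} \frac{\left|\Delta L_e^{(t)}(\hat{u}_k)\right|^2}{\exp\left(\left|L^{(t)}(\hat{u}_k)\right|\right)} < \text{threshold} \qquad (3.17)$$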


3.2.4 Optimal and sub-optimal algorithms 

As mentioned above, the “symbol by symbol” MAP algorithm is the optimal method. If we use the “symbol by symbol” MAP decoding rule for systematic convolutional codes in feedback form with a binary trellis, formula (3.16) can be written as

$$L(\hat{u}_i) = L_c\,y_{i,1} + L(u_i) + \log \frac{\sum_{(s',s),\ u_i=+1} \gamma_i^{(e)}(s',s)\,\alpha_{i-1}(s')\,\beta_i(s)}{\sum_{(s',s),\ u_i=-1} \gamma_i^{(e)}(s',s)\,\alpha_{i-1}(s')\,\beta_i(s)} \qquad (3.18)$$


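Here $s'$ and $s$ denote the states at levels $i-1$ and $i$, respectively. We have the forward recursion

$$\alpha_i(s) = \sum_{s'} \gamma_i(s',s)\,\alpha_{i-1}(s') \qquad (3.19)$$

and the backward recursion

$$\beta_{i-1}(s') = \sum_{s} \gamma_i(s',s)\,\beta_i(s) \qquad (3.20)$$

The forward and backward recursions are initialized with $\alpha_{start}(0) = 1$ and $\beta_{end}(0) = 1$. The branch transition probabilities between $s'$ and $s$ are

$$\gamma_i^{(e)}(s',s) = \exp\left(\frac{1}{2}\sum_{\nu=2}^{n} L_c\,y_{i,\nu}\,x_{i,\nu}\right) \qquad (3.21)$$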

The calculation of actual probabilities can be avoided by using the logarithm of the probabilities and the approximation $\log(e^{L_1} + e^{L_2}) \approx \max(L_1, L_2)$. This sub-optimal realization of the “symbol by symbol” MAP rule is called the Log-MAP rule. It has been shown that the performance of the Log-MAP algorithm is close to that of the optimal “symbol by symbol” MAP algorithm [4].
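To give a feeling for the size of the approximation error, the following sketch compares the exact value of log(e^{L1} + e^{L2}) with max(L1, L2). The function names are hypothetical; the exact version uses the standard identity log(e^a + e^b) = max(a, b) + log(1 + e^{-|a-b|}), which is added here for illustration and is not taken from the text.

```python
import math


def log_sum_exact(l1, l2):
    """log(exp(l1) + exp(l2)), evaluated in a numerically stable way."""
    return max(l1, l2) + math.log1p(math.exp(-abs(l1 - l2)))


def log_sum_max(l1, l2):
    """The approximation log(exp(l1) + exp(l2)) ~ max(l1, l2)."""
    return max(l1, l2)


for l1, l2 in [(0.0, 0.0), (2.0, -1.0), (10.0, -10.0)]:
    error = log_sum_exact(l1, l2) - log_sum_max(l1, l2)
    print(l1, l2, error)
```

The error is largest, log 2, when both arguments are equal and vanishes quickly as they move apart, which is consistent with the small performance loss reported above.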


The “soft-in/soft-out” Viterbi algorithm (SOVA) for systematic convolutional codes in feedback form with a binary trellis can also be used. The SOVA output in its approximate version has the format:

$$L_{SOVA}(\hat{u}_i) = L_c\,y_{i,1} + L(u_i) + L_e(\hat{u}_i) \qquad (3.22)$$

In which, $L_e(\hat{u}_i)$ is the product of $\hat{u}_i$ and the first three terms in the formula.


$$M_i = M_{i-1} + \frac{1}{2} L(u_i)\,u_i + \frac{1}{2}\sum_{\nu=1}^{n} L_c\,y_{i,\nu}\,x_{i,\nu} \qquad (3.23)$$

This method preserves the desired additive structure. Consequently, we subtract the input values from the soft output of the SOVA and obtain the extrinsic information to be used in the metrics of the succeeding decoder. The extrinsic term is weakly correlated in this case. For small memories, the SOVA has about half the complexity of the Log-MAP algorithm.

When the MAP decoding rule is used for linear binary block codes, the branch transition probability for systematic block codes with statistically independent information bits can be written as

$$\gamma_i(s',s) = P(s \mid s')\cdot p(y_i \mid s',s) = p(x_i; y_i) = \begin{cases} p(y_i \mid x_i)\cdot P(u_i), & 1 \le i \le k \\ p(y_i \mid x_i), & k+1 \le i \le n \end{cases} \qquad (3.24)$$

Also, the log-likelihood ratio is

$$L(x_i; y_i) = \begin{cases} L_c\,y_i + L(u_i), & 1 \le i \le k \\ L_c\,y_i, & k+1 \le i \le n \end{cases} \qquad (3.25)$$


Thus the soft output of the “symbol by symbol” MAP algorithm for block codes can be written as

$$L(\hat{u}_i) = L_c\,y_i + L(u_i) + \log \frac{\sum_{(s',s),\ u_i=+1} \alpha_{i-1}(s')\,\beta_i(s)}{\sum_{(s',s),\ u_i=-1} \alpha_{i-1}(s')\,\beta_i(s)} \qquad (3.26)$$

If the Log-MAP algorithm is used, the formula can be simplified as

$$L_{Log\text{-}MAP}(\hat{u}_i) = L_c\,y_i + L(u_i) + \max_{(s',s),\ u_i=+1}\left(\log \alpha_{i-1}(s') + \log \beta_i(s)\right) - \max_{(s',s),\ u_i=-1}\left(\log \alpha_{i-1}(s') + \log \beta_i(s)\right) \qquad (3.27)$$


3.3 Implementation of the algorithm 

Two methods are considered to implement the MAP decoding rule for linear block codes. One method works on the original code and is closely related to the “symbol by symbol” MAP algorithm; the other uses the dual code. The two algorithms lead to the same result.

3.3.1 Straightforward implementation 

Omitting the terms which are equal for all transitions from time $i-1$ to time $i$ and using the preceding definition of $L(x_i; y_i)$, the branch transition operation used in (3.19) and (3.20) can be written as $\exp(L(x_i; y_i)\,x_i/2)$, so (3.16) can be described as

$$L(\hat{u}_i) = L_c\,y_i + L(u_i) + \log \frac{\sum_{x \in C,\ u_i=+1} \prod_{j=1,\ j \ne i}^{n} \exp\left(L(x_j; y_j)\,x_j/2\right)}{\sum_{x \in C,\ u_i=-1} \prod_{j=1,\ j \ne i}^{n} \exp\left(L(x_j; y_j)\,x_j/2\right)} \qquad (3.28)$$

This equation separates the codewords into two groups: one with all the codewords having a “+1” at the $i$th position, the other with all the codewords having a “-1” at the $i$th position.
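For a short code the two sums in (3.28) can be evaluated directly by listing all codewords. The sketch below does this for the (7,4) Hamming code and the L-values of the decoding example in Section 3.3.3. It is an illustration only: the function names are hypothetical, and it assumes, as the worked example does, that a code bit 1 contributes exp(+L/2) and a code bit 0 contributes exp(-L/2).

```python
import itertools
import math

# Generator matrix of the (7,4) Hamming code used in the decoding example.
G = [
    [1, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1],
]


def codewords(generator):
    """Enumerate all 2^k codewords u*G over GF(2)."""
    k, n = len(generator), len(generator[0])
    for u in itertools.product((0, 1), repeat=k):
        yield tuple(sum(u[r] * generator[r][c] for r in range(k)) % 2
                    for c in range(n))


def group_sums(llr, i, generator=G):
    """Split the codeword sum of (3.28) by the value of code bit i (0-based).

    Returns (sum over codewords with bit i = 0, sum with bit i = 1).
    Position i itself is left out of every product.
    """
    sums = [0.0, 0.0]
    for c in codewords(generator):
        weight = math.prod(
            math.exp(llr[j] * (2 * c[j] - 1) / 2.0)
            for j in range(len(c)) if j != i
        )
        sums[c[i]] += weight
    return sums[0], sums[1]


llr = [1.6, 2.7, -1.2, -0.8, -0.6, 0.08, 1.1]
sum_bit0, sum_bit1 = group_sums(llr, 0)
print(sum_bit0, sum_bit1, math.log(sum_bit0 / sum_bit1))
```

The logarithm of the ratio of the two sums is the extrinsic term of (3.28); which sum forms the numerator depends on which bit value is mapped to “+1”. The direct enumeration visits all 2^k codewords, which is why the trellis constructions described next are of interest for longer codes.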

This separation can be implemented in the trellis by small changes in the construction principle. In general, $k$ different trellises are constructed to obtain the soft output $L(\hat{u}_i)$ for all information bits.

The trellis is built by using all the columns of the H matrix excluding the $i$th one, and additionally by storing every path ending at time $n$ at the state $S_n = h_i$.

$S_{end1} = 0$ and $S_{end2} = h_i$ are the two possible ending states. The time steps in the trellis are named after the corresponding column of the H matrix, thus the $i$th time instant will not appear in the trellis any longer.

The paths ending in the zero state $S_{end1}$ represent the codewords with a “+1” at the $i$th position. The paths ending in the state $S_{end2}$ represent the codewords with a “-1” at the $i$th position. For the class of cyclic codes, the trellises for the different information bits are obtained by simply shifting the indices.


3.3.2 Dual code implementation 

If $n - k < k$, the dual code has fewer codewords than the original code, so in this situation the use of the dual code reduces the decoding complexity. The dual code $C'$ can be represented by a trellis with at most $2^k$ states.

The forward recursion can be written as

$$\bar{\alpha}_i(s) = \sum_{s'} \gamma_i(s',s)\,\bar{\alpha}_{i-1}(s') \qquad (3.29)$$

and the backward recursion as

$$\bar{\beta}_{i-1}(s') = \sum_{s} \gamma_i(s',s)\,\bar{\beta}_i(s) \qquad (3.30)$$

The recursions are initialized with $\bar{\alpha}_0(0) = 1$ and $\bar{\beta}_n(0) = 1$. The branch transition probabilities between the states $s'$ and $s$ are defined here as

$$\gamma_i(s',s) = \left(\tanh\left(L(x_i; y_i)/2\right)\right)^{(1 - x_i')/2} \qquad (3.31)$$


Two methods are used to implement the “symbol by symbol” MAP rule using the dual code. Method 1 builds up the full trellis for the dual code and performs one forward and one backward recursion. The soft output for each information bit is calculated by the formula

$$L(\hat{u}_i) = L_c\,y_i + L(u_i) + \log \frac{\sum_{(s',s)} \bar{\alpha}_{i-1}(s')\,\bar{\beta}_i(s)}{\sum_{(s',s),\ x_i'=+1} \bar{\alpha}_{i-1}(s')\,\bar{\beta}_i(s) - \sum_{(s',s),\ x_i'=-1} \bar{\alpha}_{i-1}(s')\,\bar{\beta}_i(s)} \qquad (3.32)$$

Method 2 constructs the modified trellis for the dual codewords and performs one forward recursion for each information bit. The soft output can be written as:

$$L(\hat{u}_i) = L_c\,y_i + L(u_i) + 2\,\mathrm{artanh}\left(\bar{\alpha}_n(S_{end2}) / \bar{\alpha}_n(S_{end1})\right) \qquad (3.33)$$

The dual code of a cyclic code is again a cyclic code, so the modified trellis for every information symbol can still be built one from the other by simply shifting the indices.
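The claim that the original code and the dual code lead to the same result can be checked numerically for a small code. The sketch below is not the trellis procedure of this section: it enumerates the codewords of the (7,4) Hamming code and of its dual directly, using the tanh weighting of (3.31) for the dual codewords. It assumes the mapping bit 0 to “+1” and bit 1 to “-1”, and all function names are hypothetical.

```python
import itertools
import math

G = [[1, 0, 0, 0, 1, 0, 1], [0, 1, 0, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1, 0], [0, 0, 0, 1, 0, 1, 1]]   # spans the code C
H = [[1, 1, 1, 0, 1, 0, 0], [0, 1, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]                          # spans the dual code


def span(rows):
    """All GF(2) linear combinations of the given rows."""
    n = len(rows[0])
    return [tuple(sum(a * r[c] for a, r in zip(coeffs, rows)) % 2
                  for c in range(n))
            for coeffs in itertools.product((0, 1), repeat=len(rows))]


def extrinsic_original(llr, i):
    """Extrinsic value from the 2^k codewords of C (assumed: bit 0 -> +1)."""
    num = den = 0.0
    for c in span(G):
        w = math.prod(math.exp(llr[j] * (1 - 2 * c[j]) / 2.0)
                      for j in range(len(c)) if j != i)
        if c[i] == 0:
            num += w
        else:
            den += w
    return math.log(num / den)


def extrinsic_dual(llr, i):
    """Extrinsic value from the 2^(n-k) dual codewords with tanh weights."""
    t = [math.tanh(v / 2.0) for v in llr]
    total = signed = 0.0
    for d in span(H):
        w = math.prod(t[j] for j in range(len(d)) if j != i and d[j] == 1)
        total += w
        signed += -w if d[i] == 1 else w
    return math.log(total / signed)


llr = [1.6, 2.7, -1.2, -0.8, -0.6, 0.08, 1.1]
for i in range(4):
    print(i, extrinsic_original(llr, i), extrinsic_dual(llr, i))
```

For this code the dual has 8 codewords against 16 for the original code, so the dual computation touches fewer terms; the two printed columns should agree up to rounding.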
3.3.3 A decoding example using the straightforward implementation

For convenience, we again choose the (7, 4) Hamming code and the same systematic H matrix as in 3.1.2.

$$H = \begin{bmatrix} 1&1&1&0&1&0&0 \\ 0&1&1&1&0&1&0 \\ 1&1&0&1&0&0&1 \end{bmatrix} \qquad (3.2)$$


Thus we will have a corresponding G matrix (see Section 1):

$$G = \begin{bmatrix} 1&0&0&0&1&0&1 \\ 0&1&0&0&1&1&1 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&0&1&1 \end{bmatrix} \qquad (3.34)$$

Assume we have the information bits $u = [1\ 1\ 0\ 1]$; then the codeword is

$$v = u \cdot G = [1\ 1\ 0\ 1] \cdot \begin{bmatrix} 1&0&0&0&1&0&1 \\ 0&1&0&0&1&1&1 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&0&1&1 \end{bmatrix} = [1\ 1\ 0\ 1\ 0\ 0\ 1] \qquad (3.35)$$

The first four bits in v are information bits, and the last three bits are parity bits. 

In BPSK transmission, we actually transmit the signal sequence [1, 1, -1, 1, -1, -1, 1]. The received signals are simulated using SPW software. In the simulation, in order to make $L_c = 1$ (to simplify the calculation), we set the variance of the noise to 2. The simulated results are y = [1.6 2.7 -1.2 -0.8 -0.6 0.08 1.1].

If hard decoding is applied directly to these received signals, we obtain the sequence [1 1 0 0 0 1 1]. Two errors have occurred: one in the fourth information bit and one in the second parity bit. Though the (7, 4) Hamming code can detect two errors, it can only correct one error, so the normal decoding fails.
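The failure of hard-decision decoding in this example can be reproduced with a few lines of code. The sketch below encodes u with the G matrix of (3.34), takes hard decisions on y, and applies ordinary single-error syndrome correction with the H matrix of (3.2). The helper names are hypothetical, and the decision rule (a positive sample is read as bit 1) is the mapping implied by the transmitted sequence above.

```python
G = [[1, 0, 0, 0, 1, 0, 1], [0, 1, 0, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1, 0], [0, 0, 0, 1, 0, 1, 1]]
H = [[1, 1, 1, 0, 1, 0, 0], [0, 1, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]


def encode(u):
    """Codeword v = u*G over GF(2)."""
    return [sum(u[r] * G[r][c] for r in range(len(G))) % 2
            for c in range(len(G[0]))]


def hard_decision(y):
    """Positive sample -> bit 1, otherwise bit 0."""
    return [1 if v > 0 else 0 for v in y]


def syndrome(r):
    """Syndrome H*r^T over GF(2)."""
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]


def correct_single_error(r):
    """Flip the position whose H column equals the syndrome, if any."""
    s = syndrome(r)
    if not any(s):
        return list(r)
    columns = [[row[j] for row in H] for j in range(len(r))]
    corrected = list(r)
    corrected[columns.index(s)] ^= 1
    return corrected


v = encode([1, 1, 0, 1])
r = hard_decision([1.6, 2.7, -1.2, -0.8, -0.6, 0.08, 1.1])
decoded = correct_single_error(r)
print(v, r, decoded, decoded == v)
```

The corrected word is a valid codeword but not the transmitted one, because the syndrome of a two-error pattern points to a third position.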

Now we will see how our iterative block decoding works step by step, and what it 
can do to decrease the error probability. 



3.3.3.1 Constructing trellis for information bits

From the discussion in section 3.3.1, we know that four trellises should be built, one for each of the information bit locations, and that a soft output $L(\hat{u}_i)$ should be obtained from each trellis.

We build the trellis for the first information bit as an example. The trellis should be built following these rules:

1) The trellis is built by using all the columns of the H matrix except the $i$th one.

2) There are two ending states of the trellis: $S_{end1}$ represents the codewords with a “+1” at the $i$th position, $S_{end2}$ represents the codewords with a “-1” at the $i$th position.

3) The trellises of cyclic codes are obtained by simply shifting the H matrix.

For information bit location 1, to build a trellis that satisfies the above rules, we first build a trellis from the H matrix whose columns are cyclically shifted to the left, so that the first column becomes the last column.


$$H_{SHIFT1} = \begin{bmatrix} 1&1&0&1&0&0&1 \\ 1&1&1&0&1&0&0 \\ 1&0&1&0&0&1&1 \end{bmatrix} \qquad (3.36)$$


The expurgated trellis constructed (same method as in 3.1.2) is shown in Fig 3.5. 


[Figure: trellis diagram with states 000 to 111 on the vertical axis and Depth 0 to Depth 7 on the horizontal axis]

Fig 3.5 Full trellis for first information bit location.

[Figure: trellis diagram with states 000 to 111 on the vertical axis]

Fig 3.6 The final trellis with two ending states for first information bit location.

The trellis in Fig 3.5 is not yet the final trellis we want. In the decoding system, we only use the trellis between depth 0 and depth 6. Thus we obtain the final trellis with two ending states shown in Fig 3.6.

For information bit location 2, the same method can be applied. First a full trellis is built by using the shifted matrix H,

$$H_{SHIFT2} = \begin{bmatrix} 1&0&1&0&0&1&1 \\ 1&1&0&1&0&0&1 \\ 0&1&0&0&1&1&1 \end{bmatrix} \qquad (3.37)$$

Then the part of the trellis between depth 6 and depth 7 is discarded (Fig 3.7).

[Figure: trellis diagram]

Fig 3.7 The final trellis with two ending states for information bit location 2

We see two differently structured trellises for information bit location 1 and information bit location 2. Trellises for information bit locations 3 and 4 can be constructed by further shifting the H matrix.
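The column shifts used for (3.36) and (3.37) are easy to generate programmatically. A minimal sketch, with a hypothetical helper name:

```python
def shift_columns_left(matrix):
    """Cyclically shift the columns one place left (first column becomes last)."""
    return [row[1:] + row[:1] for row in matrix]


H = [[1, 1, 1, 0, 1, 0, 0],
     [0, 1, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

H_SHIFT1 = shift_columns_left(H)          # used for information bit 1
H_SHIFT2 = shift_columns_left(H_SHIFT1)   # used for information bit 2
for row in H_SHIFT2:
    print(row)
```

Applying the same helper two more times would give the matrices for information bit locations 3 and 4.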

3.3.3.2 The decoding system

Fig 3.8 shows the basic decoding system for straightforward implementation. 



[Figure: block diagram of the decoding system with the feedback path for the next iteration]

Fig 3.8 The decoding system of (7,4) Hamming code while working on information bit 1

In this system, $L(y_k \mid x_k) = L_c y_k$ is the input to the system. $L(u_i)$ is added to the information bits but not to the parity bits. All the information and parity bits are stored in the seven buffers. While the system works on information bit 1, it calculates the extrinsic value by using the trellis we have constructed for that bit. This $L_e(\hat{u}_1)$ is fed back to $L(u_1)$ as the a priori value of the next iteration. Finally, the extrinsic value is added to $L(y_1 \mid x_1)$ to give the final soft output $L(\hat{u}_1)$.

For information bit 2, the system shifts the buffers and changes the corresponding trellis. Fig 3.9 shows the system while working on information bit 2.

[Figure: block diagram of the decoding system after shifting the buffers]

Fig 3.9 The decoding system of the (7,4) Hamming code while working on information bit 2


3.3.3.3 Calculation of extrinsic value from the constructed trellis

The calculation of the extrinsic value is done by using (3.28). We go through the trellis for information bit location 1 to show the steps to calculate $L_e(\hat{u}_1)$. We separate the calculation in (3.28) for each depth in the trellis. The equation for each depth can be described as

$$\log \alpha_j(s_1) = \log\left(\alpha_{j-1}(s_1)\exp(-L(x_j; y_j)/2) + \alpha_{j-1}(s_2)\exp(L(x_j; y_j)/2)\right), \quad j \ne i \qquad (3.38)$$

Between depth $j-1$ and depth $j$, the factor $\exp(-L(x_j; y_j)/2)$ used in the equation is the branch transition operation from state $s_1$ to state $s_1$ when the input is 0. The factor $\exp(L(x_j; y_j)/2)$ used in the equation is the branch transition operation from state $s_2$ to state $s_1$ when the input is 1.

Assume $L(u) = 0$ before the first iteration. From (3.25) and the simulated received signals, we can calculate the value of each $L(x_j; y_j)$ as follows:

$L(x_1; y_1) = 1.6$, $L(x_2; y_2) = 2.7$, $L(x_3; y_3) = -1.2$, $L(x_4; y_4) = -0.8$,

$L(x_5; y_5) = -0.6$, $L(x_6; y_6) = 0.08$, $L(x_7; y_7) = 1.1$

(3.39)

Now we start from the beginning of the trellis (Fig 3.6).

Depth 0:

We set the initial condition $\alpha_0(0) = 1$.

Depth 1:

From depth 0 to depth 1, the input is $L(x_2; y_2) = 2.7$, so $\exp(-L(x_2; y_2)/2) = 0.26$ and $\exp(L(x_2; y_2)/2) = 3.9$. There are 2 states at depth 1, and each state has only one previous state.

For state 0, $\alpha_1(0) = \alpha_0(0)\exp(-L(x_2; y_2)/2) = 0.26$;

For state 7, $\alpha_1(7) = \alpha_0(0)\exp(L(x_2; y_2)/2) = 3.9$;

(3.40)

Depth 2:

There are four states at depth 2, and each state still has only one previous state. Calculating by the same method as for depth 1, with $L(x_3; y_3) = -1.2$, we get

$\alpha_2(0) = \alpha_1(0)\exp(-L(x_3; y_3)/2) = 0.26 \times 1.82 = 0.47$;
$\alpha_2(1) = \alpha_1(7)\exp(L(x_3; y_3)/2) = 3.9 \times 0.55 = 2.15$;
$\alpha_2(6) = \alpha_1(0)\exp(L(x_3; y_3)/2) = 0.26 \times 0.55 = 0.14$;
$\alpha_2(7) = \alpha_1(7)\exp(-L(x_3; y_3)/2) = 3.9 \times 1.82 = 7.10$;

(3.41)

Depth 3:

8 states, each state with one previous state, $L(x_4; y_4) = -0.8$, we get

$\alpha_3(0) = 0.47 \times 1.5 = 0.71$, $\alpha_3(1) = 2.15 \times 1.5 = 3.23$,
$\alpha_3(2) = 2.15 \times 0.67 = 1.44$, $\alpha_3(3) = 0.47 \times 0.67 = 0.31$,
$\alpha_3(4) = 7.10 \times 0.67 = 4.76$, $\alpha_3(5) = 0.14 \times 0.67 = 0.09$,
$\alpha_3(6) = 0.14 \times 1.5 = 0.21$, $\alpha_3(7) = 7.10 \times 1.5 = 10.65$

(3.42)

Depth 4:

8 states, each state with 2 previous states, $L(x_5; y_5) = -0.6$, we get

$\alpha_4(0) = \alpha_3(0)\exp(-L(x_5; y_5)/2) + \alpha_3(4)\exp(L(x_5; y_5)/2) = 0.71 \times 1.35 + 4.76 \times 0.74 = 4.48$
$\alpha_4(1) = \alpha_3(1)\exp(-L(x_5; y_5)/2) + \alpha_3(5)\exp(L(x_5; y_5)/2) = 3.23 \times 1.35 + 0.09 \times 0.74 = 4.43$
$\alpha_4(2) = \alpha_3(2)\exp(-L(x_5; y_5)/2) + \alpha_3(6)\exp(L(x_5; y_5)/2) = 1.44 \times 1.35 + 0.21 \times 0.74 = 2.10$
$\alpha_4(3) = \alpha_3(3)\exp(-L(x_5; y_5)/2) + \alpha_3(7)\exp(L(x_5; y_5)/2) = 0.31 \times 1.35 + 10.65 \times 0.74 = 8.30$
$\alpha_4(4) = \alpha_3(4)\exp(-L(x_5; y_5)/2) + \alpha_3(0)\exp(L(x_5; y_5)/2) = 4.76 \times 1.35 + 0.71 \times 0.74 = 6.96$
$\alpha_4(5) = \alpha_3(5)\exp(-L(x_5; y_5)/2) + \alpha_3(1)\exp(L(x_5; y_5)/2) = 0.09 \times 1.35 + 3.23 \times 0.74 = 2.51$
$\alpha_4(6) = \alpha_3(6)\exp(-L(x_5; y_5)/2) + \alpha_3(2)\exp(L(x_5; y_5)/2) = 0.21 \times 1.35 + 1.44 \times 0.74 = 1.35$
$\alpha_4(7) = \alpha_3(7)\exp(-L(x_5; y_5)/2) + \alpha_3(3)\exp(L(x_5; y_5)/2) = 10.65 \times 1.35 + 0.31 \times 0.74 = 14.61$

(3.43)

Depth 5:

4 states, each state with 2 previous states, $L(x_6; y_6) = 0.08$, we get

$\alpha_5(0) = \alpha_4(0)\exp(-L(x_6; y_6)/2) + \alpha_4(2)\exp(L(x_6; y_6)/2) = 4.48 \times 0.96 + 2.10 \times 1.04 = 6.49$
$\alpha_5(1) = \alpha_4(1)\exp(-L(x_6; y_6)/2) + \alpha_4(3)\exp(L(x_6; y_6)/2) = 4.43 \times 0.96 + 8.30 \times 1.04 = 12.89$
$\alpha_5(4) = \alpha_4(4)\exp(-L(x_6; y_6)/2) + \alpha_4(6)\exp(L(x_6; y_6)/2) = 6.96 \times 0.96 + 1.35 \times 1.04 = 8.08$
$\alpha_5(5) = \alpha_4(5)\exp(-L(x_6; y_6)/2) + \alpha_4(7)\exp(L(x_6; y_6)/2) = 2.51 \times 0.96 + 14.61 \times 1.04 = 17.60$

(3.44)

Depth 6:

2 states as ending states of the trellis, each state with 2 previous states, $L(x_7; y_7) = 1.1$, we get

$\alpha_6(0) = \alpha_5(0)\exp(-L(x_7; y_7)/2) + \alpha_5(1)\exp(L(x_7; y_7)/2) = 6.49 \times 0.57 + 12.89 \times 1.73 = 26.00$
$\alpha_6(5) = \alpha_5(5)\exp(-L(x_7; y_7)/2) + \alpha_5(4)\exp(L(x_7; y_7)/2) = 17.60 \times 0.57 + 8.08 \times 1.73 = 24.01$

(3.45)

Finally, we obtain the extrinsic value $L_e(\hat{u}_1) = \log 26.00 - \log 24.01 \approx 0.08$ (natural logarithm).

(3.46)
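The depth-by-depth calculation above can be reproduced with a short forward recursion. The sketch below is a simplified illustration: it tracks all reachable states instead of the expurgated trellis and only reads off the two end states after the last step, which gives the same two sums. The state is taken to be the partial syndrome with the top row of H as the most significant bit, which matches the state labels 7, 6, 5 used above; the function names are hypothetical.

```python
import math

H = [[1, 1, 1, 0, 1, 0, 0],
     [0, 1, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]


def column_state(h, j):
    """Column j of H read as a binary number (top row = most significant bit)."""
    state = 0
    for row in h:
        state = (state << 1) | row[j]
    return state


def forward_sums(llr, i, h=H):
    """Forward recursion over all code positions except i (0-based).

    Returns (alpha at end state 0, alpha at end state h_i).
    """
    alpha = {0: 1.0}                      # initial condition alpha_0(0) = 1
    for j in range(len(llr)):
        if j == i:
            continue                      # position i does not appear in the trellis
        hj = column_state(h, j)
        nxt = {}
        for s, a in alpha.items():
            # Input 0: state unchanged, weight exp(-L/2).
            nxt[s] = nxt.get(s, 0.0) + a * math.exp(-llr[j] / 2.0)
            # Input 1: column h_j is added to the state, weight exp(+L/2).
            nxt[s ^ hj] = nxt.get(s ^ hj, 0.0) + a * math.exp(llr[j] / 2.0)
        alpha = nxt
    return alpha.get(0, 0.0), alpha.get(column_state(h, i), 0.0)


llr = [1.6, 2.7, -1.2, -0.8, -0.6, 0.08, 1.1]
end0, end5 = forward_sums(llr, 0)
print(end0, end5, math.log(end0) - math.log(end5))
```

The two end-state values come out close to the hand-computed 26.00 and 24.01; small differences are expected because the hand calculation rounds every intermediate factor to two decimals.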
3.3.3.4 Simulation result

We have mentioned above that, without the iterative log-likelihood decoding, two bits are in error and the decoding of the Hamming code fails. Using the program of Guo, in which the iterative log-likelihood algorithm is applied, we find that the error in information bit 4 has been corrected.